SmartOS: Serious issue with openjdk 17 and TLS
This post transcribes some emails I sent to several mailing lists.
The initial email was sent to the Kafka User mailing list:
Reply-To: users@kafka.apache.org
Message-ID: <92aa6286-d875-04ed-6205-9e4a8fe3a1a1@jcea.es>
Date: Thu, 9 Nov 2023 03:37:39 +0100
To: users@kafka.apache.org
From: Jesus Cea <jcea@jcea.es>
Subject: How to get a X509 broker certificate with "openssl s_client"?

I am trying to remotely access the brokers' certificates (for audit purposes, expiration alarms, etc) using this command:

"""
openssl s_client -showcerts -connect localhost:9092
"""

The connection is correctly established, but something is wrong. The TLS session has some errors at the beginning, but it succeeds at the end:

"""
[jcea@Kafka ~]$ openssl s_client -showcerts -connect localhost:9092
CONNECTED(00000004)
1:error:1408F119:SSL routines:ssl3_get_record:decryption failed or bad record mac:ssl/record/ssl3_record.c:676:
---
no peer certificate available
---
No client certificate CA names sent
Server Temp Key: X25519, 253 bits
---
SSL handshake has read 1696 bytes and written 300 bytes
Verification: OK
---
New, TLSv1.3, Cipher is TLS_AES_256_GCM_SHA384
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
"""

I also tried writing a tiny TLS client in Python, with the same result. It raises this exception: "ssl.SSLError: [SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:992)".

I guess there is some kind of preamble before TLS negotiation. Is that documented somewhere? How can I check the brokers' certificates remotely?

Thanks.
This message never received an answer.
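For the audit use case described in the email (reading broker certificates and their expiration dates remotely), a minimal Java sketch could look like the following. It assumes the listener speaks standard TLS. The class name `BrokerCertAudit` and the helper `daysUntilExpiry` are hypothetical names made up for this example, and the accept-all trust manager is meant for inspection only, because it performs no certificate validation. Against a broker affected by the problem described in this post, the handshake would presumably fail much like `openssl s_client` did.

```java
import java.net.InetSocketAddress;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.time.Instant;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper, not part of the original test case.
final class BrokerCertAudit {

    // Whole days from "now" until "notAfter" (negative once the certificate has expired).
    static long daysUntilExpiry(Instant notAfter, Instant now) {
        return Duration.between(now, notAfter).toDays();
    }

    // Accepts any certificate chain. Inspection only: this performs NO validation.
    private static final TrustManager ACCEPT_ALL = new X509TrustManager() {
        @Override public void checkClientTrusted(X509Certificate[] chain, String authType) { }
        @Override public void checkServerTrusted(X509Certificate[] chain, String authType) { }
        @Override public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
    };

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("Usage: java BrokerCertAudit <host> <port>");
            return;
        }
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, new TrustManager[] { ACCEPT_ALL }, null);

        try (SSLSocket socket = (SSLSocket) context.getSocketFactory().createSocket()) {
            // 5 second connect and read timeouts (arbitrary values for this sketch)
            socket.connect(new InetSocketAddress(args[0], Integer.parseInt(args[1])), 5000);
            socket.setSoTimeout(5000);
            socket.startHandshake();

            Instant now = Instant.now();
            for (Certificate cert : socket.getSession().getPeerCertificates()) {
                X509Certificate x509 = (X509Certificate) cert;
                System.out.println(x509.getSubjectX500Principal().getName()
                        + " expires " + x509.getNotAfter()
                        + " (" + daysUntilExpiry(x509.getNotAfter().toInstant(), now) + " days)");
            }
        }
    }
}
```

It would be run as, for instance, `java BrokerCertAudit localhost 9092`. Keeping the day arithmetic in a separate method makes it easy to reuse for an expiration alarm.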
I tried to debug the issue myself. It was strange: my Kafka brokers running under SmartOS communicated just fine with each other, but they could not interconnect with Kafka brokers running on Linux, or even accept regular TLS connections.
After a while it was apparent that the issue was OpenJDK 17 running under SmartOS. I sent this email to the SmartOS mailing list:
Message-ID: <7a529ea6-1c24-56ff-7dbc-40fed022cc40@jcea.es>
Date: Tue, 5 Dec 2023 00:16:51 +0100
To: smartos-discuss <smartos-discuss@lists.smartos.org>
From: Jesus Cea <jcea@jcea.es>
List-Id: "smartos-discuss" <smartos-discuss.lists.smartos.org>
Subject: [smartos-discuss] **CRITICAL:** Serious issue with openjdk 17 and TLS

After many hours of trying to debug the bizarre SSL behaviour of OpenJDK 17, I have determined that TLS in OpenJDK 17 (under SmartOS) is not actually TLS compatible. OpenJDK 17 programs can interact with each other using TLS, but that TLS is not actually TLS. It is something else.

I have written a minimal test to check this and rule out bugs in OpenSSL or Kafka (the software I discovered this issue with). This is a malfunction of OpenJDK 17 as distributed for SmartOS image base-64-lts 22.4.1. Running this test under OpenJDK 8 or OpenJDK 11 works fine (same SmartOS image).

I don't know if the mailing list admits attachments. This one is only a couple of kbytes long. I attach the testcase. It is small, so you can/should inspect it before running it on your machine.

Steps to reproduce:

1. Install "gmake" and "openjdk17" via "pkgin".
2. Unpack the attached file.
3. Do "make keystore" to create a self-signed certificate for the purpose of the test.
4. Do "make run_server". This will run a Java TLS test server on 127.0.0.1, port 32345.

Now you can try to connect to 127.0.0.1:32345 with a TLS client. The connection will fail when running the server with OpenJDK 17, but it will work under version 8 or 11. You could do "make run_client", for instance, or try "openssl s_client -connect 127.0.0.1:32345".

When running under OpenJDK 8 or 11, you will get a correct TLS handshake, the server certificate and a "Hola mundo!" reply. When running under OpenJDK 17 you will get bizarre errors, no certificate and an aborted connection because the handshake is incorrect.

Interestingly, if both ends are running code under OpenJDK 17, this works, but they cannot interoperate with other "correct" TLS implementations.

So, OpenJDK 17 under SmartOS, with image base-64-lts 22.4.1, is actually implementing its own incorrect TLS handshake.

I think this issue is critical and deserves serious scrutiny from people with more knowledge and leverage than me.

PS: Trying OpenJDK 17 on Linux, it works fine. This is a specific issue with OpenJDK 17 and SmartOS, as currently distributed via "pkgin".

PS: I have not tried image base-64-trunk.

PS: In the test package I have included a Java client too. You can try to connect to an HTTPS server and notice that it can't be done under OpenJDK 17. If you configure the client to connect to the Java test server, both running under OpenJDK 17, it will work. So the TLS implementation is broken and incompatible, but it works when both ends are broken in the same way (both running OpenJDK 17).
The attached source code was this:
- File Makefile:
Makefile-20231231 (Source code)
JAVAC=javac
JAVA=java
SERVER=127.0.0.1
PORT=32345

all: SimpleTLSServer

SimpleTLSServer: SimpleTLSServer.java
	$(JAVAC) SimpleTLSServer.java

keystore:
	keytool -genkeypair -alias myalias -keyalg RSA -keysize 2048 -keystore keystore.p12 -storetype PKCS12 -storepass "tmztmz"

clean:
	$(RM) *.class

run_server: SimpleTLSServer
	$(JAVA) SimpleTLSServer

run_client:
	openssl s_client -connect $(SERVER):$(PORT)
- File SimpleTLSClient.java:
SimpleTLSClient.java (Source code)
import javax.net.ssl.*;
import java.io.*;
import java.security.KeyStore;

public class SimpleTLSClient {
    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("Usage: java SimpleTLSClient <ip_address> <port>");
            return;
        }

        String ipAddress = args[0];
        int port = Integer.parseInt(args[1]);

        try {
            // Create the SSL context
            SSLContext sslContext = SSLContext.getInstance("TLS");

            // Load the trust store
            KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
            trustStore.load(new FileInputStream("/home/z/test_java_tls/keystore.p12"), "tmztmz".toCharArray());

            // Create the trust manager factory
            TrustManagerFactory trustManagerFactory = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            trustManagerFactory.init(trustStore);

            // Initialize the SSL context with the trust managers
            sslContext.init(null, trustManagerFactory.getTrustManagers(), null);

            // Create the client socket
            SSLSocketFactory sslSocketFactory = sslContext.getSocketFactory();
            // These two work
            //SSLSocket sslSocket = (SSLSocket) sslSocketFactory.createSocket("localhost", 32345);
            //SSLSocket sslSocket = (SSLSocket) sslSocketFactory.createSocket("2a01:b5c0:1:2::200", 32345);
            // This one, against a SmartOS server, does not work
            //SSLSocket sslSocket = (SSLSocket) sslSocketFactory.createSocket("2a01:b5c0:1:1:10:c60b:a495:aa58", 32345);
            SSLSocket sslSocket = (SSLSocket) sslSocketFactory.createSocket(ipAddress, port);

            // Read the response
            BufferedReader in = new BufferedReader(new InputStreamReader(sslSocket.getInputStream()));
            System.out.println("Server response: " + in.readLine());

            // Close the connection
            sslSocket.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
- File SimpleTLSServer.java:
SimpleTLSServer.java (Source code)
import javax.net.ssl.*;
import java.io.*;
import java.security.KeyStore;

public class SimpleTLSServer {
    public static void main(String[] args) {
        try {
            // Set up the key store
            System.setProperty("javax.net.ssl.keyStore", "keystore.p12");
            System.setProperty("javax.net.ssl.keyStoreType", "PKCS12");
            System.setProperty("javax.net.ssl.keyStorePassword", "tmztmz");

            // Load the keystore
            KeyStore keyStore = KeyStore.getInstance("PKCS12");
            FileInputStream fileInputStream = new FileInputStream(System.getProperty("javax.net.ssl.keyStore"));
            keyStore.load(fileInputStream, System.getProperty("javax.net.ssl.keyStorePassword").toCharArray());

            // Create the key manager factory
            KeyManagerFactory keyManagerFactory = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
            keyManagerFactory.init(keyStore, System.getProperty("javax.net.ssl.keyStorePassword").toCharArray());

            // Create the SSL context
            SSLContext sslContext = SSLContext.getInstance("TLS");
            sslContext.init(keyManagerFactory.getKeyManagers(), null, null);

            // Create the server socket
            SSLServerSocketFactory sslServerSocketFactory = sslContext.getServerSocketFactory();
            SSLServerSocket sslServerSocket = (SSLServerSocket) sslServerSocketFactory.createServerSocket(32345);

            // Wait for connections
            while (true) {
                SSLSocket sslSocket = (SSLSocket) sslServerSocket.accept();

                // Send the response
                PrintWriter out = new PrintWriter(sslSocket.getOutputStream(), true);
                out.println("Hola mundo!");

                // Close the connection
                sslSocket.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
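Because the test case behaves differently depending on the OpenJDK version and the operating system, it can help to record both next to each test run. The following is a small sketch of such a report, not part of the original attachment. The class name `TlsRuntimeReport` and its helper are hypothetical, and the output is deliberately not shown because it depends on the machine.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;

// Hypothetical helper, not part of the original test case.
final class TlsRuntimeReport {

    // TLS protocol versions that the JVM's default SSLContext enables out of the box.
    static List<String> defaultProtocols() {
        try {
            return Arrays.asList(SSLContext.getDefault().getDefaultSSLParameters().getProtocols());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("No default SSLContext available", e);
        }
    }

    public static void main(String[] args) {
        // Standard system properties identifying the runtime under test
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.vendor  = " + System.getProperty("java.vendor"));
        System.out.println("os.name      = " + System.getProperty("os.name"));
        System.out.println("os.version   = " + System.getProperty("os.version"));
        System.out.println("default TLS protocols = " + defaultProtocols());
    }
}
```

Running it with each installed JDK (8, 11 and 17 in the tests above) gives a quick record of which runtime produced which result.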
The thread was interesting and useful. I will copy only the most relevant messages here.
Jonathan Perkin <jperkin@mnx.io> wrote:
Date: Tue, 5 Dec 2023 14:20:00 +0000
From: Jonathan Perkin <jperkin@mnx.io>
To: smartos-discuss <smartos-discuss@lists.smartos.org>
Subject: Re: [smartos-discuss] **CRITICAL:** Serious issue with openjdk 17 and TLS
Message-ID: <ZW8xkGvJKVjBd9i1@mnx.io>
References: <5E433AA7-3888-42AA-857A-08FBF4C8AB31@mnx.io> <E84ECA4D-2425-454F-8AAE-D8DF08AFF03D@jcea.es> <ZW8dhI9nU3Fq1z_C@mnx.io>
List-Id: "smartos-discuss" <smartos-discuss.lists.smartos.org>

* On 2023-12-05 at 12:57 GMT, Jonathan Perkin wrote:

>* On 2023-12-05 at 11:31 GMT, Jesus Cea wrote:
>
>>No, I have only tried SmartOS and Linux. I am not an OmniOS user.
>
>I've discussed this with Peter Tribble who has done the vast majority
>of the recent work to keep OpenJDK's up-to-date on illumos. He
>doesn't see the same issue with his OpenJDK on Tribblix, and so I'm
>currently going through all of our differences to see why it's failing
>in our build.

Ok, I found it. You'll need to edit this file:

  /opt/local/java/openjdk17/conf/security/sunpkcs11-solaris.cfg

and change the section at the end so that it looks like this:

  disabledMechanisms = {
    CKM_DSA_KEY_PAIR_GEN
    CKM_MD5
    CKM_SHA256
    CKM_SHA384
    CKM_SHA512
    SecureRandom
  }

i.e. add CKM_MD5 and CKM_SHA* to the list.

I'll get this patched in the package, and possibly look at adding it directly to the illumos jdk patchset if it's likely to be needed for the foreseeable future.
Jasper Siepkes wrote:
To: smartos-discuss <smartos-discuss@lists.smartos.org>
Subject: Re: [smartos-discuss] **CRITICAL:** Serious issue with openjdk 17 and TLS
Message-Id: <17018571190.9C74df30.91713@composer.smartos.topicbox.com>
References: <5E433AA7-3888-42AA-857A-08FBF4C8AB31@mnx.io> <E84ECA4D-2425-454F-8AAE-D8DF08AFF03D@jcea.es> <ZW8dhI9nU3Fq1z_C@mnx.io> <ZW8xkGvJKVjBd9i1@mnx.io>
Date: Wed, 6 Dec 2023 05:05:19 -0500
List-Id: "smartos-discuss" <smartos-discuss.lists.smartos.org>
From: "Jasper Siepkes via smartos-discuss" <smartos-discuss@lists.smartos.org>

I think this the same issue I bumped into earlier: https://smartdatacenter.topicbox.com/groups/sdc-discuss/T09724ab085cbde0d-M988193518ea2d51f3b6f95ff/openjdk-17-tls-broken

Basically I always run everything with "-Dsun.security.pkcs11.enable-solaris=false" which works around the problem. Unless you need some hardware PKCS#11 store on illumos it should be safe. Should be noted I never tested the performance difference.

Kind regards,

Jasper (Siepkes)
Jonathan Perkin <jperkin@mnx.io> added:
Date: Wed, 6 Dec 2023 14:07:20 +0000
From: Jonathan Perkin <jperkin@mnx.io>
To: smartos-discuss <smartos-discuss@lists.smartos.org>
Subject: Re: [smartos-discuss] **CRITICAL:** Serious issue with openjdk 17 and TLS
Message-ID: <ZXCAGGpHxfez5QC0@mnx.io>
References: <5E433AA7-3888-42AA-857A-08FBF4C8AB31@mnx.io> <E84ECA4D-2425-454F-8AAE-D8DF08AFF03D@jcea.es> <ZW8dhI9nU3Fq1z_C@mnx.io> <ZW8xkGvJKVjBd9i1@mnx.io> <17018571190.9C74df30.91713@composer.smartos.topicbox.com>
List-Id: "smartos-discuss" <smartos-discuss.lists.smartos.org>

* On 2023-12-06 at 10:06 GMT, Jasper Siepkes via smartos-discuss wrote:

>I think this the same issue I bumped into earlier: https://smartdatacenter.topicbox.com/groups/sdc-discuss/T09724ab085cbde0d-M988193518ea2d51f3b6f95ff/openjdk-17-tls-broken
>
>Basically I always run everything with "-Dsun.security.pkcs11.enable-solaris=false" which works around the problem.

Yeh I think it's the same issue. trunk now has 17.0.9 packages which fix it, as well as updating to the latest GA. I'll look at backporting to 2022Q4.
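Both remedies in the thread involve the Solaris PKCS#11 provider: one disables the MD5 and SHA-2 digest mechanisms in its configuration, the other switches the provider off. One way to see which provider a given JVM actually picks for those digests is to ask the JCA directly. This is only a diagnostic sketch: the class name `DigestProviderCheck` is hypothetical, and the expectation that an affected system reports a provider whose name starts with "SunPKCS11" is my assumption, not something confirmed in the thread.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;

// Hypothetical diagnostic, not part of the original thread.
final class DigestProviderCheck {

    // Name of the provider the JVM selects for the given digest algorithm.
    static String providerFor(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm).getProvider().getName();
        } catch (NoSuchAlgorithmException e) {
            return "(not available)";
        }
    }

    public static void main(String[] args) {
        // Installed providers, in preference order
        for (Provider provider : Security.getProviders()) {
            System.out.println("provider: " + provider.getName());
        }
        // The digests named in the disabledMechanisms fix (CKM_MD5, CKM_SHA256, CKM_SHA384, CKM_SHA512)
        for (String algorithm : new String[] { "MD5", "SHA-256", "SHA-384", "SHA-512" }) {
            System.out.println(algorithm + " -> " + providerFor(algorithm));
        }
    }
}
```

Comparing the output before and after editing `sunpkcs11-solaris.cfg`, or with and without the `-D` flag quoted above, should show whether the change took effect.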
After that thread, I sent this message to the Kafka User mailing list:
List-Id: <users.kafka.apache.org>
Message-ID: <b5059c6a-82c4-6cee-2c20-99b96ada0bbb@jcea.es>
Date: Tue, 5 Dec 2023 03:11:07 +0100
Subject: Re: How to get a X509 broker certificate with "openssl s_client"?
To: users@kafka.apache.org
References: <92aa6286-d875-04ed-6205-9e4a8fe3a1a1@jcea.es>
From: Jesus Cea <jcea@jcea.es>
In-Reply-To: <92aa6286-d875-04ed-6205-9e4a8fe3a1a1@jcea.es>

On 9/11/23 3:37, Jesus Cea wrote:
> I am trying to remotely access the brokers' certificates (for audit
> purposes, expiration alarms, etc) using this command:
>
> """
> openssl s_client -showcerts -connect localhost:9092
> """
>
> The connection is correctly established, but something is wrong. The TLS
> session has some errors at the beginning, but it succeeds at the end:

This is actually a bug in OpenJDK 17 under SmartOS. More details: <https://smartos.topicbox.com/groups/smartos-discuss/Tfa6a653f74458e7f/critical-serious-issue-with-openjdk-17-and-tls>.
In my environment, I (temporarily) worked around the problem by using OpenJDK 11 when running under SmartOS. The issue was resolved in SmartOS release 23.4.0.