Discussion:
Cipher algorithm AES/GCM/NoPadding performing inconsistently across two identically configured cluster nodes.
Timothy Akujuaobi
2014-08-20 12:16:01 UTC
Hi.

I have several boxes set up in a cluster environment running Tomcat 6,
openjdk-1.6.0.0.x86_64 and Bouncy Castle bcprov-jdk15-1.45.
I am using the AES/GCM/NoPadding cipher algorithm, and the Bouncy Castle
provider is loaded from within the application classpath.
The bcprov-jdk15-1.45.jar is bundled in 3 different jar files on the
classpath - one shaded, and the other two not.
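
With the same provider jar bundled in several places, it can help to confirm which copy each node actually loaded. The following is a minimal sketch, not taken from the original setup: the class name ProviderOrigin and the method originOf are hypothetical, and it only reports what the running JVM knows about each registered security provider.

```java
import java.security.CodeSource;
import java.security.Provider;
import java.security.Security;

class ProviderOrigin {
    /** Returns the location a class was loaded from, or a marker if none is recorded. */
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader usually have no code source.
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // List every registered provider with the jar and class loader it came from.
        for (Provider provider : Security.getProviders()) {
            System.out.println(provider.getName()
                    + " <- " + originOf(provider.getClass())
                    + " via " + provider.getClass().getClassLoader());
        }
    }
}
```

Running this inside the web application on each node (for example from a diagnostic servlet) and comparing the output would show whether the nodes resolved the provider from the same jar.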
The problem is that some data encrypted on one cluster node
intermittently fails to decrypt on a second node, with the exception
details below:

Caused by: javax.crypto.BadPaddingException: mac check in GCM failed
    at org.bouncycastle.jce.provider.JCEBlockCipher.engineDoFinal(Unknown Source) ~[bcprov-jdk15-1.45.jar:1.45.0]
    at javax.crypto.Cipher.doFinal(DashoA13*..) ~[na:1.6]
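
For context, a GCM tag ("mac") check fails whenever the decrypting side computes a different authentication tag from the one attached to the ciphertext, whatever the cause. The sketch below reproduces that failure on purpose by flipping one bit of the key. It rests on assumptions that differ from the setup described here: it uses the JDK's default provider and GCMParameterSpec, which need Java 7 or later, rather than Bouncy Castle 1.45 on Java 6, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.BadPaddingException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class GcmTagCheckDemo {
    static Cipher newCipher(int mode, byte[] key, byte[] iv) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        // 128-bit authentication tag, appended to the ciphertext by the JCE API.
        cipher.init(mode, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, iv));
        return cipher;
    }

    static byte[] encrypt(byte[] key, byte[] iv, byte[] plaintext) {
        try {
            return newCipher(Cipher.ENCRYPT_MODE, key, iv).doFinal(plaintext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if the ciphertext passes the GCM tag check under the given key and IV. */
    static boolean decrypts(byte[] key, byte[] iv, byte[] ciphertext) {
        try {
            newCipher(Cipher.DECRYPT_MODE, key, iv).doFinal(ciphertext);
            return true;
        } catch (BadPaddingException e) {
            // The JDK reports a tag mismatch as a BadPaddingException subclass.
            return false;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] iv = new byte[12];
        random.nextBytes(key);
        random.nextBytes(iv);
        byte[] ciphertext = encrypt(key, iv, "hello".getBytes(StandardCharsets.UTF_8));

        byte[] otherKey = key.clone();
        otherKey[0] ^= 0x01; // a single flipped bit is enough to break the tag check
        System.out.println("same key:  " + decrypts(key, iv, ciphertext));
        System.out.println("other key: " + decrypts(otherKey, iv, ciphertext));
    }
}
```

The point of the sketch is only that the exception does not say which input disagreed: a different key, IV, ciphertext, or a cipher implementation that computes differently on each side would all surface the same way.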

This behaviour is usually triggered by a deployment into the
server instance running the application, though not by every deployment.
Whenever this failure occurs, the encrypted data generated on either node
fails to decrypt on the other.
However, each node can successfully decrypt its own encrypted data while in
this state.

I have verified the AES key and initialisation vector used are the same
across the inconsistent nodes.
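
One way to compare key material across nodes without writing the raw key to a log is to compare a digest of it. This is only a sketch of that idea and not necessarily how the check above was done; the class name KeyFingerprint and its method are hypothetical, and a digest of a key should still be treated as sensitive.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class KeyFingerprint {
    /** Returns a hex SHA-256 digest over the key bytes followed by the IV bytes. */
    static String fingerprint(byte[] key, byte[] iv) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            sha256.update(key);
            sha256.update(iv);
            StringBuilder hex = new StringBuilder();
            for (byte b : sha256.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required of every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder all-zero values; a real check would pass the node's actual key and IV.
        System.out.println(fingerprint(new byte[16], new byte[12]));
    }
}
```

If two nodes print the same fingerprint, they hold the same key and IV bytes, which points the investigation at the cipher computation itself.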
I have also taken heap dumps off two nodes in this 'incompatible' state,
and the environment and system properties as well as application states are
consistent as expected.

Does anyone have any suggestions as to what could be triggering this
behaviour, or at least any pointers to help my investigation?

Thanks
Tim Whittington
2014-08-20 23:24:51 UTC
> Hi.
> I have several boxes set up in a cluster environment running tomcat 6, openjdk-1.6.0.0.x86_64 and bouncycastle bcprov-jdk15-1.45.

This is a really old version of Java - I would try the latest of the 1.6 series (or 1.7/1.8 if you can) to see if this is a JVM issue.

> I am using AES/GCM/NoPadding cipher algorithm, and bouncycastle provider is loaded from within the application classpath.
> The bcprov-jdk15-1.45.jar is bundled in 3 different jar files on the classpath - one shaded, and the other two not.
> The problem is that I have some data that when encrypted on one cluster node intermittently fails to decrypt on a second node with the exception details below
> Caused by: javax.crypto.BadPaddingException: mac check in GCM failed
> at org.bouncycastle.jce.provider.JCEBlockCipher.engineDoFinal(Unknown Source) ~[bcprov-jdk15-1.45.jar:1.45.0]
> at javax.crypto.Cipher.doFinal(DashoA13*..) ~[na:1.6]
> Usually, this behaviour sometimes gets triggered by a deployment into the server instance running the application.
> Whenever this failure occurs, the encrypted data generated on either node fails to decrypt on the other.
> However, each node can successfully decrypt its own encrypted data while in this state.

I’ve seen something similar on HP-UX with Blowfish encryption - in that case it was a JIT bug, which caused the core encrypt methods to change behaviour once they were compiled by the JIT. Behaviour was consistent on the same machine, but was broken when communicating with a JVM on another platform.
Given how old your version of Java is, it’s possible you’re also encountering a JIT bug.
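
A rough way to look for this kind of JIT-dependent change is to encrypt the same fixed input many times in one JVM and check that the output never differs from the first run, which normally executes before the hot methods are compiled. The sketch below is an illustration under assumptions, not a reproduction of either bug discussed here: it uses the JDK's default AES/GCM provider (Java 7 or later), the class and method names are hypothetical, and the iteration count is arbitrary.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class RepeatedEncryptCheck {
    // Fixed all-zero inputs so every run should give identical output.
    // Reusing a key/IV pair like this is acceptable only in a test.
    private static final byte[] KEY = new byte[16];
    private static final byte[] IV = new byte[12];
    private static final byte[] PLAINTEXT = new byte[64];

    static byte[] encryptOnce() {
        try {
            // A fresh Cipher per call avoids per-instance IV reuse checks.
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(KEY, "AES"),
                    new GCMParameterSpec(128, IV));
            return cipher.doFinal(PLAINTEXT);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the first iteration whose output differs from iteration 0, or -1 if none. */
    static int firstDivergence(int iterations) {
        byte[] reference = encryptOnce();
        for (int i = 1; i < iterations; i++) {
            if (!Arrays.equals(reference, encryptOnce())) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int result = firstDivergence(200000);
        System.out.println(result < 0
                ? "no divergence seen"
                : "diverged at iteration " + result);
    }
}
```

A clean run does not rule out a JIT bug, since compilation decisions depend on load and timing, but a reported divergence would be strong evidence of one.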

cheers
tim
Timothy Akujuaobi
2014-08-26 15:32:02 UTC
Hi Tim,

Thanks for your earlier response.

I took your advice and investigated whether the encryption/decryption issues
I was experiencing were related to a JIT bug, as you suggested.
I was able to replicate this issue consistently by running my tests under
significant load.

I ran my tests again with the JIT disabled and could no longer replicate the
failures, although the tests took days to complete because of the lack
of optimization.
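
The post does not say how the JIT was disabled. On HotSpot-based JVMs the usual switch is the -Xint option, which runs everything in the interpreter, and that is the assumption behind this sketch. It reports the mode the running JVM is in, so a test run can record whether the JIT was really off. The class name JitStatus is hypothetical.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

class JitStatus {
    /** Describes the JVM's execution mode and JIT compiler, as far as the JVM reports them. */
    static String describe() {
        // On HotSpot this property typically reads "mixed mode" or "interpreted mode".
        String vmInfo = System.getProperty("java.vm.info", "unknown");
        // Null when the JVM has no compilation system at all.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        String compiler = (jit == null) ? "none" : jit.getName();
        return "vm.info=" + vmInfo + ", compiler=" + compiler;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Logging this once at application start-up on each node makes it easy to confirm afterwards which configuration a given test run actually used.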

I also ran the same tests on newly configured local Tomcat instances set up
to mirror the server environment, but with the latest release of OpenJDK
1.6 [jdk6-b32] and JIT compilation enabled, and could not replicate the
issue there either.

A Java upgrade therefore looks to have solved my problems.

Thanks for your steer.

regards,
Tim