Java OutOfMemoryError: unable to create new native thread

Another strange problem came up at work last week, and I’ve finally solved it. To be honest, the problem itself isn’t so strange, but the way it showed up and the errors we got were. As with most problems, once you discover the root cause it stops looking odd and you think: “It’s obvious!” :-).

Background

I like to be concise when explaining this kind of thing, so to sum up: we have a Java server that acts as an authenticator for other applications. This authentication server uses the javax.security.auth.login Java package to connect to an LDAP server and authenticate against it.

In addition, an intensive process tries to get information from another application through a web service that requires authentication. To get authenticated, this process uses the authentication server described above, but after about 300 calls in one or two minutes we start seeing this Java exception stack trace:

04:13:07.682 [] ERROR com.xxxxx.xxxxx.server.services.XXXXXXXXServiceHelper.login(): 
Authentication error : java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:691)
at com.sun.jndi.ldap.Connection.<init>(Connection.java:231)
at com.sun.jndi.ldap.LdapClient.<init>(LdapClient.java:136)
at com.sun.jndi.ldap.LdapClient.getInstance(LdapClient.java:1600)
at com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2698)
at com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:316)
at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(LdapCtxFactory.java:193)
at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(LdapCtxFactory.java:211)
at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(LdapCtxFactory.java:154)
at com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(LdapCtxFactory.java:84)
at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:684)
at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:307)
at javax.naming.InitialContext.init(InitialContext.java:242)
at javax.naming.ldap.InitialLdapContext.<init>(InitialLdapContext.java:153)
at com.xxxxx.xxxxx.server.ldap.LdapConnection.open(LdapConnection.java:115)
at com.xxxxx.xxxxx.server.ldap.LdapConnection.open(LdapConnection.java:97)
at com.xxxxx.xxxxx.server.ldap.LdapLoginModule.attemptAuthentication(LdapLoginModule.java:325)
at com.xxxxx.xxxxx.server.ldap.LdapLoginModule.login(LdapLoginModule.java:175)
at sun.reflect.GeneratedMethodAccessor10194.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:784)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:698)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:696)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:695)
at javax.security.auth.login.LoginContext.login(LoginContext.java:594)

At first, people thought that our authentication server application was running out of heap memory… but it wasn’t, because it didn’t stop working. After 3 or 4 minutes it worked fine again, and no errors were shown until the next execution of the intensive process. Then one of the development teams involved in the issue asked me to help them investigate.

I was sure it wasn’t a heap memory problem in the authentication server, because the OutOfMemoryError appeared in a Java log trace of the application, not of the application server where it is deployed, so the application kept working… not really well, but it worked! Another possible cause I considered was LDAP itself, but our LDAP server is not Java based, and its log files didn’t show any errors either.

The key

The key to this problem was right in front of our eyes, in the message attached to the OutOfMemoryError: unable to create new native thread
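
One way to check that the trouble is about threads rather than heap is to watch both numbers from inside the JVM. The following is only a minimal sketch using the standard java.lang.management API. The class name ThreadProbe and its method names are made up for illustration and are not part of our authentication server.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: reports thread and heap figures for the current JVM.
class ThreadProbe {

    /** Number of live threads (daemon and non-daemon) in this JVM. */
    static int liveThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.getThreadCount();
    }

    /** Highest live thread count since the JVM started or the peak was last reset. */
    static int peakThreads() {
        return ManagementFactory.getThreadMXBean().getPeakThreadCount();
    }

    /** Heap currently in use, in bytes. */
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        System.out.println("live threads: " + liveThreads());
        System.out.println("peak threads: " + peakThreads());
        System.out.println("used heap (bytes): " + usedHeapBytes());
    }
}
```

If the thread count climbs quickly during the intensive process while the used heap stays well below its maximum, that points at thread creation and not at heap exhaustion.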

The solution

After reading several pages on the subject, it seems this problem can be solved in different ways. What is certain is that increasing the JVM heap size won’t solve it, because the JVM heap memory was fine.

I’m not a Linux system administrator, but after investigating a bit I discovered that it was the machine hosting the application server (and, with it, our authentication server) that was running out of resources for creating operating system threads. Our system administrator realized that the Linux machine still had its default configuration, so the limits on running processes and on open file descriptors were too low. Finally, I asked the system administrator to increase those numbers. This can be done by modifying the values for nofile and nproc in the file /etc/security/limits.conf:

#        - nofile - max number of open files
#        - nproc - max number of processes

The safest way would be to raise the limits only for the user involved, not for the whole machine, but that must be the system administrator’s decision.
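
To see which limits a running JVM actually got, you can read /proc/self/limits on Linux. On Linux the "Max processes" row corresponds to nproc and also counts threads, and "Max open files" corresponds to nofile. The following is a rough sketch and not something we ran on the server: the class name ProcLimits and the method softLimit are hypothetical, and it assumes the usual column layout of that file (name, soft limit, hard limit, units).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: extracts soft limits from the text of /proc/<pid>/limits.
class ProcLimits {

    /**
     * Returns the soft limit for the row starting with the given name
     * (for example "Max processes"), or null if there is no such row.
     * The value is returned as text because it can be the word "unlimited".
     */
    static String softLimit(List<String> lines, String name) {
        for (String line : lines) {
            if (line.startsWith(name)) {
                String rest = line.substring(name.length()).trim();
                if (rest.isEmpty()) {
                    return null;
                }
                // The first column after the name is the soft limit.
                return rest.split("\\s+")[0];
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        Path limits = Paths.get("/proc/self/limits");
        if (!Files.isReadable(limits)) {
            System.out.println("/proc/self/limits is not available on this system");
            return;
        }
        List<String> lines = Files.readAllLines(limits, StandardCharsets.UTF_8);
        System.out.println("nproc  (Max processes):  " + softLimit(lines, "Max processes"));
        System.out.println("nofile (Max open files): " + softLimit(lines, "Max open files"));
    }
}
```

Running it as the same user as the application server shows the limits that user gets, which is useful for confirming that a change in limits.conf has taken effect after a new login.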

VirtualBox error “VT-x is disabled in the BIOS. (VERR_VMX_MSR_VMXON_DISABLED)”

After updating my Fedora 19 x64 I tried to open a VM I have installed on it but received this error:

VT-x is disabled in the BIOS. (VERR_VMX_MSR_VMXON_DISABLED)

The message suggests that I had to enable some VT-x setting in the BIOS, but I never touched that configuration before or after the update, so that couldn’t be the problem.
The VirtualBox version currently installed is 4.3.4 and the kernel is 3.11.10.

If you are in the same situation as me (you can’t or don’t want to change the BIOS configuration), what you can do is change the VM’s settings. First, get the name and UUID of the VM you want to fix:

VBoxManage list vms
"My Guest VM Name" {e6b08efd-0453-497b-b934-ff8ad17baad3}

Then turn off the long mode flag for that VM:

VBoxManage modifyvm "My Guest VM Name" --longmode off

The bad side of this is that I don’t know why it broke after the update; the good side is that I can continue working.

Gnome3 alacarte on Fedora 16

The release of Fedora 16 brought some problems. One of them is that alacarte is temporarily broken: when you try to run it, you get an error:

File "/usr/lib/python2.7/site-packages/Alacarte/Mainwindow.py", line 19, in <module>
import gtk, gmenu, gio
ImportError: No module named gmenu

Well, there is an easy solution. You have to download gnome-menus-3.0.1-1.fc15.x86_64.rpm or gnome-menus-3.0.1-1.fc15.i686.rpm, depending on your PC architecture (64 or 32 bits).

Once you have the package, copy it to a temporary directory and extract its contents:

rpm2cpio gnome-menus-3.0.1-1.fc15.x86_64.rpm | cpio -ivd

Next, copy (as root) this file, which was inside the .rpm file, to the same path on your system:

cp tmp/usr/lib64/libgnome-menu.so.2.4.13 /usr/lib64/libgnome-menu.so.2.4.13

Make a symbolic link within /usr/lib64/:

ln -s libgnome-menu.so.2.4.13 libgnome-menu.so.2

Then copy this file to the same path as well:

cp tmp/usr/lib64/python2.7/site-packages/gmenu.so /usr/lib64/python2.7/site-packages

If your system is a 32-bit PC, replace lib64 with lib.
Now you should be able to run alacarte.

However, this doesn’t fix the problem of editing menus in Gnome 3: with alacarte you can edit menus, but they don’t correspond to what you see in the Applications view… that’s another mystery to solve…