We've run into an issue all of a sudden where a database instance on one of our servers keeps going offline. We have not been able to identify the cause, but we have noticed that when we kill off all the semaphores, it continues to run without problems even though the load on the system increases. We are heading into our peak season in a few weeks and the load on the db server is going to increase dramatically. It would be nice to identify the cause of the problem before that happens.
We are planning to update it to ASA9 running in ASA7 compatibility mode, and that in and of itself might fix the issue, but identifying the cause now would be in our best interests.
Other details: The database server has 2 instances on it. The problem instance contains a 9.3 GB shared database; the other instance contains 4 small databases that have never gone offline.
kernel.shmmax = 268435456
kernel.sem = 250 32000 32 256
System is a VMware host with 2 CPUs and 12 GB of RAM
If you delete all of the semaphores used by the database server (and/or clients), you can hit a very old bug in which the server spins in a tight loop and consumes lots of CPU. A change made in Nov. 2003, which (I believe) went into ASA 9.0.0, causes the server to shut down with an error saying so if its semaphore is deleted. A related issue that caused the server to crash when its semaphore was deleted was fixed in ASA 9.0.1 build 2020 and 9.0.2 build 3120 (QTS 390341).
So I suspect that you are hitting this issue in ASA 7: when the semaphore is deleted, the server spins and consumes a lot of CPU but continues to process requests (slowly).
Please note that ASA 7 is long ago EOL'ed. ASA 9 is also EOL.
If you must stick to 9.0.x then I would recommend switching to the latest available 9.0.2 EBF. Your problem will likely be resolved, provided you don't delete the server's (and clients') semaphore. If the semaphore is being deleted by some process on your system, then you need to track down which process is doing this, e.g. check the other applications on the system for use of "ipcrm".
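If it is not obvious what is removing the semaphores, one rough way to narrow it down is to log exactly when a semaphore set disappears. The sketch below is only an illustration, not anything shipped with ASA: it assumes a Linux system that exposes /proc/sysvipc/sem, and the function names (parse_sem_table, removed_sets, watch) are made up for this example. It does not tell you which process called "ipcrm"; it only gives you timestamps that you can compare against cron jobs, maintenance scripts, or other logs.

```python
import time

SEM_TABLE = "/proc/sysvipc/sem"  # assumption: Linux with the SysV IPC proc interface


def parse_sem_table(text):
    """Return {semid: (key, nsems, uid)} from the contents of /proc/sysvipc/sem."""
    sems = {}
    for line in text.splitlines()[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        key, semid, _perms, nsems, uid = fields[:5]
        sems[int(semid)] = (int(key), int(nsems), int(uid))
    return sems


def removed_sets(before, after):
    """Semaphore sets present in 'before' but missing from 'after'."""
    return {semid: info for semid, info in before.items() if semid not in after}


def read_table():
    with open(SEM_TABLE) as f:
        return parse_sem_table(f.read())


def watch(interval=1.0):
    """Poll the table forever and print a timestamped line for each removed set."""
    previous = read_table()
    while True:
        time.sleep(interval)
        current = read_table()
        gone = removed_sets(previous, current)
        for semid, (key, nsems, uid) in sorted(gone.items()):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print(f"{stamp} semid={semid} key={key} nsems={nsems} uid={uid} removed")
        previous = current
```

Calling watch() in a terminal on the database host and leaving it running until the instance next goes offline should show whether a semaphore set owned by the server's user vanished at that moment. Polling can miss a set that is created and removed within one interval, so treat the output as a hint rather than proof.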
answered 18 Jul '11, 18:04