[SQL Anywhere 12.0.1.3423]

I have a database server (dbsrv12.exe) running as a Windows service, which serves 64 databases. Recently I encountered a problem when trying to stop one of the databases using this procedure:

  1. run dbisqlc.exe
  2. connect to the server
  3. issue the command 'STOP DATABASE foo UNCONDITIONALLY' (note: there were connections to the database when this command was executed)

At this point the dbisqlc session appeared to hang, so eventually I killed it. The database was left in a strange state where:

  • it was no longer possible to connect to the database
  • the database no longer appeared in the list of databases within the server when viewed in Sybase Central
  • the database no longer appeared in the output of sa_db_info()
  • but the DB file was still locked by the dbsrv12.exe process (checked by trying to rename the DB file and then verifying what was locking it via Process Explorer)

I then checked sa_conn_info() and could see that a single connection to the database I had attempted to stop still remained: its sa_conn_info().dbnumber did not appear in the list of sa_db_info().number values. (We have a convention for connection names, which confirmed that this connection belonged to the database I had tried to stop.)
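
For anyone wanting to repeat this check, a query along the following lines should list such orphaned connections. It is only a minimal sketch, assuming the standard Number and DBNumber result columns of sa_conn_info() and the Number column of sa_db_info(); it is not the exact query that was run at the time.

```sql
-- List connections whose database number no longer matches any
-- database the server reports as running.
SELECT c.Number   AS conn_id,
       c.Name     AS conn_name,
       c.Userid,
       c.DBNumber,
       c.LastReqTime
FROM sa_conn_info() AS c
WHERE c.DBNumber NOT IN ( SELECT d.Number FROM sa_db_info() AS d )
ORDER BY c.Number;
```

Run without arguments, both procedures are expected to report on the whole server, so the query can be issued from a connection to any other database on the same server.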

I tried dropping this connection in dbisqlc using 'DROP CONNECTION' but received a 'permission denied' message. I then had a look in Sybase Central and could see the connection under the 'All connections' tab with an empty value in the 'Database' column. I tried right-clicking and then 'Disconnect' but then I was presented with a message along the lines of 'You are not connected to the database NULL' (I'm afraid I did not record the exact message but can remember the word 'NULL' appearing).

Finally I tried stopping the server service but this hung so I ended up killing the dbsrv12.exe process.

I was wondering if this is a known issue? Also, if it happens again, is there anything else I can do to try to get the database to finish shutting down? Failing that, is there anything else I can look at to help investigate the problem further?

asked 22 Nov '12, 07:40

Luke


Hi Luke,

It sounds as if the database in question is getting into a hang condition that it can't break out of. Have you checked the database console window for any strange events that you aren't used to seeing? (e.g. Are checkpoints still happening on the database, are there any 'thread dispatch warnings', etc.?) Were there any other server events that coincided with when the hang was discovered? (e.g. backups, scheduled events, etc.?)

I tried dropping this connection in dbisqlc

Do you remember what connection number this was? Was it a really high number (over 10000000) or was it a reasonably low number?

I was wondering if this is a known issue?

There are indeed known and fixed hanging bugs in 12.0.1 past your current build number (e.g. CR #720784, fixed in 12.0.1.3798; the Windows x86/x64 12.0.1.3810 EBF is currently available). I would recommend trying the latest 12.0.1 EBF and seeing whether the hanging condition still reproduces for you.

Without having any further engine diagnostics, it would be difficult to guess what the actual database hanging issue is related to.

Failing that, is there anything else I can look at to help investigate the problem further?

Yes, capturing a (full) 'database server memory dump' from the dbsrv12 process while it is hung and providing it to our SAP Sybase engineers for analysis would be one of the best ways to figure out why the database is getting into this state. Supplying this dump file to technical support would be the second step towards resolution: we would then be able to tell you whether this is an already fixed issue or a previously unreported one that requires an engine fix. If you are not sure how to capture this information, we can provide instructions for capturing the server memory dump via the support case.

We also have internal tools in technical support that we can optionally provide to you in order to capture additional information from an unresponsive server.

answered 23 Nov '12, 10:03

Jeff Albion

Jeff, thanks for the detailed response.

Checkpoints continued for the database after the STOP DATABASE command was executed. Everything else seemed pretty normal and there were no errors or warnings (including 'thread dispatch warnings') in the -o log.

The connection number was not exceptionally high; it was in the same range as the other user-based connections to the server.

Ironically, we have just finished testing 12.0.1.3797 for the next release of our software but I think I'll see if we can change this to the latest EBF now. If the issue then happens again on the newer version I'll go down the support case + memory dump route.

(23 Nov '12, 10:55) Luke

Jeff, I've just downloaded the 12.0.1.3810 EBF but can't see CR#720784 mentioned in the release notes (http://origin1.sybase.com/swx/16291/rdme_1201_ebf_3810.html).

Do you know for sure that this fix is in 3810? If so, is there any detail available on what the change was? (I tried the link to the CR you posted above but it simply says it was fixed in 3798 without any explanation of what the problem or change was).

(23 Nov '12, 14:47) Luke

Yes, it seems that the CR information has not been pushed to the public website and missed the README. Very strange.

Keep in mind that this is just one of the issues fixed that I used as a 'for instance'... there are other possible issues since 12.0.1.3423. As I mentioned above though, without further submitted diagnostics, we're purely guessing at the root cause.


CR: 720784

Versions affected: 12.x and up

Modules affected: server

Versions fixed: 12.0.1.3798

Description:

In rare cases, run-time errors (such as overflow or conversion errors) detected during the execution of parallel query plans could cause the server to hang. A workaround is to disable intra-query parallelism for the affected queries (i.e. set option MAX_QUERY_TASKS=1 for the affected query/connection).
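
As a rough sketch of how that workaround might be applied (this is not part of the CR text, and it assumes the standard SET OPTION syntax; which scope is appropriate depends on which queries are affected):

```sql
-- Disable intra-query parallelism for the current connection only;
-- a TEMPORARY setting lasts until the connection ends.
SET TEMPORARY OPTION max_query_tasks = 1;

-- Alternatively, with DBA authority, change the default for all
-- users of the database.
SET OPTION PUBLIC.max_query_tasks = 1;
```

The documented default for max_query_tasks is 0, which leaves the degree of parallelism up to the server, so setting the option back to 0 should undo the change.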

(26 Nov '12, 08:34) Jeff Albion

Thanks for the clarification.

(26 Nov '12, 12:24) Luke

Could the database be trying to roll back a long transaction from one of the users unceremoniously disconnected by the STOP DATABASE foo UNCONDITIONALLY statement?

answered 23 Nov '12, 05:18

Justin Willey

I suppose this is possible but I had left it in this state for around 10 hours before I resorted to killing the server. It's not a particularly large database (214MB file size) and the client software generally doesn't generate long transactions so I think the problem is likely to lie elsewhere.

(23 Nov '12, 06:47) Luke