I'm trying to investigate a server suffering very occasional freeze-ups where the whole database becomes unresponsive. The customer has been resolving the issue by restarting the db engine, but with over 100 users this is hardly a long term solution. Occasionally the database frees itself up after some minutes, before a restart has been initiated. Trying to investigate at the time the freeze-up happens isn't possible as the system is, well, frozen and you can't connect by any method.

I have put on request level logging for SQL and BLOCKS in the hope that it may show if a long running process is blocking everything else. In the most recent freeze-up, the system was paused for around 13 minutes and then came back to life. This is reflected in the request log (viewed with sa_get_request_times() ) that shows a gap in activity at the relevant time (and lots of statements that all took approx 800,000 milliseconds - ie 13.5 minutes - to complete). What I can't see in that output is any data relating to blocking (or its absence) - the output from the procedure doesn't seem to have any relevant information. Nor does the only other procedure I am aware of for analysing request log data - sa_get_request_profile.

So in summary - having enabled request level logging of BLOCKS - how do I use it?

Many thanks.

v10.0.1.4310 (yes I know - but I don't see anything in v16 either)

asked 12 Jun '15, 12:43

Justin%20Willey's gravatar image

Justin Willey
6.6k106136205
accept rate: 21%

Version 9 had a history of non-responsiveness sometimes due to checkpoints and backups (and the checkpoints resulting from backups). If memory serves those issues went away with V10 but perhaps it's worth looking at the timing of checkpoints and non-responsiveness.

Does Foxhound show any interesting activity leading up to the periods of non-responsiveness?

(15 Jun '15, 14:50) Breck Carter
Replies hidden

I need to get 16 installed and then I can put FH on.

(17 Jun '15, 17:46) Justin Willey

After running the sa_get_request_times procedure there should be a global temporary table (owned by the DBO user) called satmp_request_block that contains the information you're looking for.

permanent link

answered 12 Jun '15, 13:28

Mikel%20Rychliski's gravatar image

Mikel Rychliski
2.1k1641
accept rate: 30%

1

Thanks Mikel - I've checked that output and the good news is that my issue isn't being caused by blocking - the bad news is I now have no clue as to what else might be doing it :(

I've added a note to DCX about that table as it isn't mentioned at all in the help documentation.

(12 Jun '15, 15:25) Justin Willey
Comment Text Removed
1

Are any of the statements that take 13 minutes to complete expected to take that long? If the multiprogramming level is low enough, all of the workers could have requests assigned to them (which would ‘block’ new requests until one of them completed).

What is the -gn value currently set to? When the server is running normally, what does SELECT PROPERTY('ActiveReq') return?

If you would like to troubleshoot the issue while the server is unresponsive, having a connection open with the dedicated_task option set to ‘On’ may help.

How many logical processors are being used by SQL Anywhere? Version 10 is end of support, and there are some issues regarding intra-query parallelism that have been fixed in 12 & 16, but not 10. If this was the case, setting max_query_tasks=1 may help.

(15 Jun '15, 09:19) Mikel Rychliski
Replies hidden
Comment Text Removed

...and please note that "ActiveReq" will not count worker threads used for proxy connections (or maybe for external procedures, don't know for v10 as that does not have external environments like v11 and above) - compare that other question.

Of course that will not matter if it's not an issue with the MPL at all.

(15 Jun '15, 14:10) Volker Barth

Thanks for following this up. Nothing should take anything like 13 minutes or even 13 seconds. There are 4 logical processors (it's a virtual machine), and there is no explicit -gn setting so it will presumably use the default of 20.

Causal checking of normal usage puts active tasks between 2 & 7 (average 2.4). There aren't any remote servers in use.

I've put an event in place to track the number of active tasks and I'll see if that sheds any light on the issue.

(In the meantime I'll see what I can do to accelerate the plan for moving to v16!)

(17 Jun '15, 17:45) Justin Willey

What I do see in the log is occasional "Internal warning: Request task x took y seconds", where Y can be up to 20 seconds. Is there any way of correlating these warnings with something in the request log?

(17 Jun '15, 17:47) Justin Willey

UPDATE - logging for a day (where no freeze up happened) gave a max of 9 active tasks

(18 Jun '15, 07:39) Justin Willey

Try the following Goggle search...

"Internal warning" "Request task"

What you will see is some finger-pointing at Queries From Hell... another reason to fire up Foxhound... or run Application Profiling if you have the stamina :)

(18 Jun '15, 08:29) Breck Carter

When you say "it's a virtual machine" are you sure that SQL Anywhere is getting more than one processor?

Are you personally in control of the system setup, or is it in the hands of a True Believer in The Church Of The Magical VM?

(18 Jun '15, 08:33) Breck Carter

The latter I fear - though reality has begun to creep in :). It does seem though we are getting to use all the processors.

(18 Jun '15, 11:45) Justin Willey
2

What I have seen so far for SQLA to become temporarily unresponsive was a full table scan on a huge table or the growth of the database file blocking until the growing finished...So in a VM environment I would have a look at IO.

(19 Jun '15, 06:08) Martin

Will do, thank you for the suggestion.

(19 Jun '15, 07:17) Justin Willey
2

It might be a good idea to check if the VM infrastructure logged any events at the time of freeze up. If someone takes a snapshot of the VM or the VM is moved between physical hosts (vMotion in VMWare), the database would be unresponsive.

(19 Jun '15, 09:50) Mikel Rychliski
showing 2 of 12 show all flat view
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×11

question asked: 12 Jun '15, 12:43

question was seen: 454 times

last updated: 19 Jun '15, 09:50