I have observed the following behavior in different 11.0.1 versions, for which I don't have any explanation: Scenario:
The problem: As soon as a sequential scan on such a table starts all new connections to the database server are accepted but NOT processed. This means all other db-clients are waiting the 2-4 minutes before they can interact with the database. In this case it doesn't matter which of the 4 databases the client wants to access. So even connections to one of the other databases (not doing the sequential scan right now) will be frozen. The same for dbconsole it will also wait until the sequential scan finishes. In the request log I just see that the connections are accepted but no single statement (even not the if ... select @@version ...) which I normally see as the first statement will be executed until the sequential scan is finished. Have I configured anything wrong? Is this a desired behavior or a bug? asked 01 Dec '10, 09:06 Martin |
I am having a problem that is very similar to this, all connections wait. I am not sure what query they wait for but the read-IO is high and constant on the DB for the entire duration of the wait period (in my case this DB is ~300gb and the wait time is about 30 mins, IO about 2mb/s). I cannot easily repeat the problem on demand as I dont know what query is causing it. There is however only 1 table in this DB, a table of blobs with an ID. Would it be useful for the future / Sybase product to try repeat this issue or to get stats on the table before refreshing them (which seems to be a possible solution, given the above and the deletes). It is very serious because of the time involved in this case (~30 mins) but also because Sybase Central cannot itself connect and hence it is not possible to attempt to trouble shoot from within Sybase Central. answered 15 Aug '11, 12:23 ivankb Which version are you using?
(16 Aug '11, 03:07)
Martin
Yes sorry, 12.0.1 - EBF 3389
(16 Aug '11, 15:15)
ivankb
|
What does sa_conn_info() display? May the sequential scan occupy all worker threads as effect of intra-query parallelization (though that should not starve others, methinks)? What's your -gn setting?
Just to add: Does have setting the max_query_tasks option to a modest value, say, 2-5, any impact? Or the use of the option priority for the table-scanning conection?
Can you fill in some details: OS Platform? Build number? Computer specs? Disk configuration?
@Volker: -gn is not set, so default, sa_conn_info only works after the sequential scan finished
@Mark: last incident was a 11.0.1.2376 on a Windows Server 2003, 32 bit, but I have seen same behavior on a Win Server 2008 R2 64 Bit and 64 Bit SQLA. Database files are located on a direct attached RAID.
How many cores in your computer? How complex is the query? is it a simple "select * from T" or does it have lots of subqueries, group bys, etc etc. I'm wondering if all of the database server workers are being used by this query and hence there is none left for any other requests to be processed? What you have posted so far is consistent with this hopothesis. You could try decreasing your MPL - see http://dcx.sybase.com/index.html#1101en/dbadmin_en11/running-s-3713576.html
@Mark: it is just a simple "select * from table where column1=x and column2=y" an index on both columns exists, anyway the table statistics have been corrupt therefore the optimizer used a sequential scan. The server should have at least 4 cores (might have 8 / 2 CPUs) I would have to double check if relevant.
@Mark: Wouldn't then increasing the MPL (in contrast to decreasing) be more useful?
@Volker: My theory was that all workers were in use by the query, but I am unsure if a simple "select * from T where <simple-predicate>" would use all 20 workers on a server? You are correct that increasing the number of workers (-gn X) could also work but it may not in this case. I was suggesting (but used the wrong term :-) to decrease the max_query_tasks option - see http://dcx.sybase.com/index.html#1101en/dbusage_en11/parallelism.html - to reduce the number of workers used by the query.
@Mark: Ah, I see (and think my theory seems similar - cf. my second comment). Now Martin might have to validate this:)
@Everyone: I'd offer a bounty on this one if that was possible... let's just say this question has an attentive audience :)
@Breck: We need to wait for Martin to run the experiment of either increasing -gn or decreasing max_query_tasks to see if our theory is correct. I am also interested to hear if there is a bug lurking in the depths or is this just the case of a run-away query consuming the entire server (and hence maybe we need to do something about that case?)
I will try to reproduce this in a lab environment, the situation on the productive server could be resolved by fixing the corrupt statistics. I will keep you updated as soon as I have new insights.
What isolation level is your application running at?
The isolation level is 0
What happens if you mark the query FOR READ ONLY? (I hate that this isn't the default)
ODBC is used for access, so my understanding is that a read-only cursor is used. Anyway a good hint that in stored procedures the default is FOR UPDATE.