The forum will experience an outage sometime between February 10 at 7:00pm EST and February 12 at 11:59 EST for installation of security updates. The actual time and duration of the outage are unknown but attempts will be made to minimize the downtime. We apologize for any inconvenience.

I have observed the following behavior in different 11.0.1 versions, for which I don't have any explanation:

Scenario:

  • I have a database server providing 4 databases.
  • a table is quite large (some million entries)
  • the table will not completely fit into cache anymore.
  • Sequential scan of this table need 2-4 minutes.

The problem:

As soon as a sequential scan on such a table starts all new connections to the database server are accepted but NOT processed. This means all other db-clients are waiting the 2-4 minutes before they can interact with the database. In this case it doesn't matter which of the 4 databases the client wants to access. So even connections to one of the other databases (not doing the sequential scan right now) will be frozen. The same for dbconsole it will also wait until the sequential scan finishes.

In the request log I just see that the connections are accepted but no single statement (even not the if ... select @@version ...) which I normally see as the first statement will be executed until the sequential scan is finished.

Have I configured anything wrong? Is this a desired behavior or a bug?

asked 01 Dec '10, 09:06

Martin's gravatar image

Martin
8.6k114149237
accept rate: 14%

What does sa_conn_info() display? May the sequential scan occupy all worker threads as effect of intra-query parallelization (though that should not starve others, methinks)? What's your -gn setting?

(01 Dec '10, 11:30) Volker Barth

Just to add: Does have setting the max_query_tasks option to a modest value, say, 2-5, any impact? Or the use of the option priority for the table-scanning conection?

(01 Dec '10, 11:51) Volker Barth

Can you fill in some details: OS Platform? Build number? Computer specs? Disk configuration?

(01 Dec '10, 12:27) Mark Culp

@Volker: -gn is not set, so default, sa_conn_info only works after the sequential scan finished

(01 Dec '10, 14:01) Martin

@Mark: last incident was a 11.0.1.2376 on a Windows Server 2003, 32 bit, but I have seen same behavior on a Win Server 2008 R2 64 Bit and 64 Bit SQLA. Database files are located on a direct attached RAID.

(01 Dec '10, 14:04) Martin

How many cores in your computer? How complex is the query? is it a simple "select * from T" or does it have lots of subqueries, group bys, etc etc. I'm wondering if all of the database server workers are being used by this query and hence there is none left for any other requests to be processed? What you have posted so far is consistent with this hopothesis. You could try decreasing your MPL - see http://dcx.sybase.com/index.html#1101en/dbadmin_en11/running-s-3713576.html

(01 Dec '10, 14:31) Mark Culp

@Mark: it is just a simple "select * from table where column1=x and column2=y" an index on both columns exists, anyway the table statistics have been corrupt therefore the optimizer used a sequential scan. The server should have at least 4 cores (might have 8 / 2 CPUs) I would have to double check if relevant.

(01 Dec '10, 14:49) Martin

@Mark: Wouldn't then increasing the MPL (in contrast to decreasing) be more useful?

(01 Dec '10, 17:52) Volker Barth

@Volker: My theory was that all workers were in use by the query, but I am unsure if a simple "select * from T where <simple-predicate>" would use all 20 workers on a server? You are correct that increasing the number of workers (-gn X) could also work but it may not in this case. I was suggesting (but used the wrong term :-) to decrease the max_query_tasks option - see http://dcx.sybase.com/index.html#1101en/dbusage_en11/parallelism.html - to reduce the number of workers used by the query.

(01 Dec '10, 22:29) Mark Culp

@Mark: Ah, I see (and think my theory seems similar - cf. my second comment). Now Martin might have to validate this:)

(02 Dec '10, 08:28) Volker Barth
1

@Everyone: I'd offer a bounty on this one if that was possible... let's just say this question has an attentive audience :)

(02 Dec '10, 14:57) Breck Carter

@Breck: We need to wait for Martin to run the experiment of either increasing -gn or decreasing max_query_tasks to see if our theory is correct. I am also interested to hear if there is a bug lurking in the depths or is this just the case of a run-away query consuming the entire server (and hence maybe we need to do something about that case?)

(02 Dec '10, 21:10) Mark Culp
1

I will try to reproduce this in a lab environment, the situation on the productive server could be resolved by fixing the corrupt statistics. I will keep you updated as soon as I have new insights.

(03 Dec '10, 08:07) Martin

What isolation level is your application running at?

(07 Dec '10, 12:13) Glenn Paulley

The isolation level is 0

(09 Dec '10, 09:45) Martin

What happens if you mark the query FOR READ ONLY? (I hate that this isn't the default)

(16 Aug '11, 11:47) PhilippeBert...

ODBC is used for access, so my understanding is that a read-only cursor is used. Anyway a good hint that in stored procedures the default is FOR UPDATE.

(16 Aug '11, 12:15) Martin
More comments hidden
showing 5 of 17 show all flat view

I am having a problem that is very similar to this, all connections wait. I am not sure what query they wait for but the read-IO is high and constant on the DB for the entire duration of the wait period (in my case this DB is ~300gb and the wait time is about 30 mins, IO about 2mb/s).

I cannot easily repeat the problem on demand as I dont know what query is causing it.

There is however only 1 table in this DB, a table of blobs with an ID.
I havent yet checked the stats on this table to see if they are very out. I have been removing a lot of rows from this table recently over night.

Would it be useful for the future / Sybase product to try repeat this issue or to get stats on the table before refreshing them (which seems to be a possible solution, given the above and the deletes).

It is very serious because of the time involved in this case (~30 mins) but also because Sybase Central cannot itself connect and hence it is not possible to attempt to trouble shoot from within Sybase Central.

permanent link

answered 15 Aug '11, 12:23

ivankb's gravatar image

ivankb
265101121
accept rate: 50%

edited 15 Aug '11, 12:24

Which version are you using?

(16 Aug '11, 03:07) Martin

Yes sorry, 12.0.1 - EBF 3389

(16 Aug '11, 15:15) ivankb
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×123

question asked: 01 Dec '10, 09:06

question was seen: 1,319 times

last updated: 17 Aug '11, 03:03