query plan changes radically

Hi, I have a question about some behavior I am seeing from the query optimizer in version 12.0.1.3311. I have seen this behavior on both windows and linux. At least for now, I do not want to get into the specifics of the actual query or the resultant query plan. I can post the schemas and query plan as a follow-up if necessary, but right now I want to describe my problem in the abstract and see if anyone can put me on the right track.

What I am seeing is that I can have an SQL query sitting in an ISQL window connected to the database. I check the query plan, it looks good, using the proper indices, etc. I run the query, it comes back in under a second. I can run it multiple times in rapid succession, all is well.

However, for some reason I have been unable to discern, at some point this query plan is thrown out the window and in its place I get this heinous nightmare of a plan doing a sequential scan of a table with 20 million records in it. The query goes from .5 seconds to 35 minutes or longer. This is on a database that is virtually or truly quiescent, little to no transactional activity occurring at all, only variations on the same query.

If I then run a create statistics on the table that is the largest (in terms of number of records) involved in the query, I will get back to my original, good query plan and excellent query performance. But, at some point in time it will degrade again.

I could understand the query plan getting fouled up if I had an unbalanced index or was adding large volumes of data to a table that is part of the query. But as I said I've observed this behavior on a database with literally no transactions occurring whatsoever, queries only.

I've run dbvalid against every table in the database, it is clean. I have unloaded and reloaded the large table that is the centerpiece of the query. I have instituted a nightly update statistics on every table in the database. None of these measures have helped.

My next step is to apply the latest EBF but really I am grasping at straws with that.

So, my question is, can anyone think of any reason why the query optimizer would be behaving in the way that I describe, and any way I can stop it permanently from doing this?

optimizer long-running-query

asked 23 Oct '12, 22:33

Jim Miani
211●3●4●12
accept rate: 0%

flat view

7 Answers:

active answers oldest answers newest answers popular answers

4	We made a large number of improvments and fixes for performance in the last 1.5 year so better use the latest EBF. permanent link answered 24 Oct '12, 05:15 Hartmut Branz 406●2●9 accept rate: 0% flat view

The column statistics are updated during query execution: for a table or index scan, SQL Anywhere engine sees the opportunity to refine the column statistics as data is read from that scan. Hence, next execution of the quey may see slightly different statistics which explains the different plan. Your description of the problem indicates that there is something wrong in this algorithm for dynamically updating statistics. Please use the latest EBF as many issues were addressed in this area. Please post/email the graphical plans with statistics if you still see the problem with the latest EBF. A. Nica

permanent link

answered 24 Oct '12, 09:36

Nica _SAP
866●7●22
accept rate: 3%

flat view

It seems that all the tables in your queries do not have either primary keys or foreign keys defined. Is this true? Because of missing primary key - foreign key constrains, both queries have extremely underestimated cardinalities for intermediate results as it cannot be decided if the predicates are correlated (they actually are) and independence asumption among the predicates is assumed. The cost estimations of different plan alternatives are underestimated as well because of wrong cardinality estimations. I recommend to define foreign keys and primary keys as apropriate, and also revise the indexes you have using the SQL Anywhere Index Consultant. With your current schema, for this type of semantically complex queries, the query optimizer has very little information to use during query optimization.

The difference in plans in this case is explained by the different cache content (i.e., index pages, base table pages) when a particular query is run. Cache content is taken into account when cost estimations are computed for alternative plans considered by the optimizer during optimzation process.

permanent link

answered 24 Oct '12, 15:41

Nica _SAP
866●7●22
accept rate: 3%

Yes, you are correct that we don't define primary and foreign keys. We have a unique index on each table of course and this approach had served us well for years - until now. But if that is how the optimizer makes its decisions, I will look at redefining some of those indices as primary and creating foreign keys and see where that takes me. I'll report back with results one way or another so this post may be useful to someone else. Thanks again for your help.

(24 Oct '12, 20:10) Jim Miani

Replies hidden

If you're dealing with a relational database like SQL Anywhere, using primary keys and foreign keys is a highly recommended practise (most folks may say: a real necessity) anyway - not just for performance reasons.

(25 Oct '12, 04:00) Volker Barth

OK, so I took Nica's advice and I have added primary and foreign keys to all of the tables in this query. After I did that I ran the index consultant and it had no suggestions. I would say I have seen a modest improvement insofar as I have not as yet seen the behavior I initially complained about - the execution of a query with a bad plan cause problems for subsequent queries that formerly had a good plan, with no db transactions executed in between.

That being said I am still seeing really bad choices by the query optimizer. And after sleeping on it, I think the PK/FK solution does not explain why the query plan was changing as I initially described. I realize now I gave a bad example in my first two postings of the query plan. I showed you the plan results for two different queries. What I should have shown you was the plan for query A (good), the plan for query B (bad), then the plan for query A again after query B was run (now bad).

At any rate, I can get back to that state if need be. For the moment I'd like to explore why I have such bad plans in the state I have the database now. Perhaps the explanation for this behavior will shed some light on the initial problem as well.

So now, after adding primary and foreign keys to all these tables and reissuing a create statistics on each table, I got plans for some different queries again.

I link them here. The first plan is perfect and runs very fast, but please note that the optimizer is able to optimize the table "draw_key" right out of the query, that has bearing on the next query. So here is the good plan:

http://icanon.com/icanon/temp/goodplan1025.saplan

But here is a very similar query that has a bad plan, and runs very slow. The ONLY difference between the two is that I include one column from the draw_key table in the SELECT, which prevents this table from being optimized out of the SQL. For some reason including this table causes a really bad plan. I have no idea why.

http://icanon.com/icanon/temp/badplan1_1025.saplan

Lastly, a query hitting the same tables but asking for a different set of data in the select, and adding a group by - this one is really bad, I include it in the hopes the additional info will shed some light on my problem.

http://icanon.com/icanon/temp/badplan2_1025.saplan

If anyone can explain the differences in these plans, I would greatly appreciate it. I really am at a loss. As I said earlier, these table structures, indices and query constructs have served us quite well at several locations for several years. I cannot pinpoint what has changed, unless it is a tipping point of volume of data. I keep thinking database corruption, but as I said I ran dbvalid on every table in the database. I also unloaded, dropped, rebuilt and reloaded the draw_detl table which is by far the largest table involved in this query. So again, sorry for the very verbose post but very appreciative of any help I can get.

(25 Oct '12, 19:05) Jim Miani

flat view

This behavior is rare, but not unheard of.

If the latest EBF doesn't stop this heinous behavior, the folks at iAnywhere might want to look at a reproducible.

Then, try to identify "One Big Thing" that makes the good plan different from the bad plan... in your case, it might be an index on the big table (the good plan uses the index, the bad plan doesn't).

Then, modify the query to force the use of the index, as in FROM ... table-name FORCE INDEX ( index-name ) ...

Repeat the test to make sure performance is good AND it doesn't flop over into bad performance.

AND THEN... mark the code (add comments, etc) to make sure nobody forgets the FORCE INDEX because you may want to remove it in the future; e.g., when upgrading to a version that doesn't need it, or worse, upgrading to a version where performance is better without it.

permanent link

answered 24 Oct '12, 08:57

Breck Carter
32.5k●541●727●1050
accept rate: 20%

edited 24 Oct '12, 08:59

flat view

Thank you all for your replies. Responding to all at once, then: I applied the latest EBF (3769) to the windows db, did a create statistics on the relevant tables, and reran my test cases and came up with the same issue/behavior. I have played with forcing the index paths in the SQL but it doesn't help and even if it did, I would prefer not to do this as the final solution. I did read about and use the "alter statistics

auto update disable" on each table in the query. Is this the same as dropping the statistics (that Martin recommended). Whether it is or isn't, this did not solve the problem (and the plan still changed following queries only, with no transactions occurring). Finally, per Nica's suggestion, I have saved the plans with Detailed and Node statistics. I don't have enough reputation points here to upload them, so you can see them at these links:

bad plan: http://icanon.com/icanon/temp/badplan.saplan good plan: http://icanon.com/icanon/temp/goodplan.saplan

You'll see that the only difference between the two SQLs is that in the "bad" plan I attempt to introduce an additional criteria into the where clause on one of the involved tables (and that is on an indexed field). I also want to reiterate that once the bad query is run, if I rerun the good query - the exact SQL in the good plan - it will now be much slower, and have a bad plan. I did not include that plan in this post, but can if it would be useful. Also, I can post the table schemas if they are needed. Here's hoping someone out there can spot something I am doing wrong that is obvious and easy to fix! Thanks again.

permanent link

answered 24 Oct '12, 14:25

Jim Miani
211●3●4●12
accept rate: 0%

flat view

Just to close this post out in case anyone ever tries to read it and find their own solution - I eventually had to reorganize this query and eliminate the draw_key table from the search. No idea why and it should not have been necessary but there it is. The addition of primary and foreign keys did not help me at all so I did not continue down that path. Thanks to everyone for the attempts to help.

permanent link

answered 29 Oct '12, 07:40

Jim Miani
211●3●4●12
accept rate: 0%

I'd suggest to accept your own answer then (here's a HOWTO).

Apparently, the "no idea why" workarounds are not that satisfying but, hey, that's reality:)

(29 Oct '12, 07:53) Volker Barth

flat view

First follow Hartmuts advice, then if it is not solving your situation, try dropping statistics, if then the query plan is ok for you just ignore the statistics. (By the way, this is not my general advice so please don't flame me, but under certain circumstances like here it might be worth a try)

permanent link

answered 24 Oct '12, 07:28

Martin
9.1k●131●170●257
accept rate: 14%

flat view

Your answer

toggle preview

community wiki:

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

*italic* or _italic_
**bold** or __bold__
link:[text](http://url.com/ "title")
image?![alt text](/path/img.jpg "title")
numbered list: 1. Foo 2. Bar
to add a line break simply add two spaces to where you would like the new line to be.
basic HTML tags are also supported

learn more about Markdown

Question tags:

optimizer ×28
long-running-query ×18

question asked: 23 Oct '12, 22:33

question was seen: 4,251 times

last updated: 29 Oct '12, 07:53

SAP SQL Anywhere

query plan changes radically

Follow this question

Related questions