I made an interesting observation, at least for me, when i checked the execution plan of some of our slowest queries.
Easy Example:

CREATE TABLE "DBA"."TMP_Date" (
"Datum" DATE NOT NULL,
PRIMARY KEY ( "Datum"),
PCTFREE 0
);

insert into DBA.TMP_Date
  select dateadd(day,row_num,current date - 100) from sa_Rowgenerator(1,200);

select * from DBA.TMP_Date where Datum >= '2013-04-19'

If you look at the plan of the last query you will see that statistics are used to guess the selectivity of the sole condition.

2013-04-19 <= Datum <= * ASC : 50.499999523% Statistics

If you change the query to:

select * from DBA.TMP_Date where Datum >= current date

Which, as of the date of writing, reveals the same result. The optimizer doesn't use the statistics in this scenario, at least in 11 and 16 where i tested it.

TMP_Date.Datum >= today() : 25% Guess

The same happens with current timestamp btw. It seems the optimizer doesn't handle current date, today() or current timestamp as a constant and i can see a reason for this. If the plan is cached and reused on the next day e.g. then the used statistics could be wrong. The thing is we used current date quite a lot in our queries and it happens to have a severe impact in a few of them. In the above example it doesn't make a difference i think, but the "constant" guess of 25% selectivity for every condition with current date, ... leads in a few of our queries to the usage of a not very efficient index. When using the statistics the selectivity would have been 0.01% and another condition with an index had 20 % selectivity. So the optimizer chooses that index (20% < 25%) and the resulting query is about ten times slower than using a constant date or adding an user estimate. We added therefore user estimates in a few places and tried to use constants where it's possible, but that's not the most elegant solution in my opinion.

Try the following in the SA 16 demo db, to see a relevant (execution time doubled) example:
Add an index on OrderDate to the SalesOrders table:

CREATE INDEX "OrderDate" ON "GROUPO"."SalesOrders" ( "OrderDate" ASC );

Now take a look at the plan of the following two (in my view identical) queries:

SELECT * FROM "GROUPO"."SalesOrders" so key join "GROUPO"."Customers" cu 
  where cu.Country = 'Canada' and 
  orderdate >= '2001-06-26'

SELECT * FROM "GROUPO"."SalesOrders" so key join "GROUPO"."Customers" cu 
  where cu.Country = 'Canada' and 
  orderdate >= dateadd(day,-datediff(day,'2001-06-26',current date),current date)

I've just seen that changing the condition to

orderdate = dateadd(day,-datediff(day,'2001-06-26',current date),current date)

and voila the statistics and therefore the OrderDate index is used.

asked 19 Apr '13, 06:09

Markus%20D%C3%BCtting's gravatar image

Markus Dütting
53641220
accept rate: 30%

edited 23 Apr '13, 11:36

Our solution was to use static values where possible and to omit the usage of current date or dateadd function in queries, too.

(19 Apr '13, 07:16) Martin

Plan caching is probably not involved; see Plan caching

Surely those queries are not examples of your "slowest queries". Before drawing conclusions about how CURRENT DATE is (mis)treated by SQL Anywhere, please try some more complex queries, on a table with more than one column and 200 rows... it is entirely possible SQL Anywhere could be enhanced, but a better argument is required.

In this example, both queries show "Optimization method - Bypassed then optimized". Both queries end up with the same plan (Index Only Retrieval Scan) so it doesn't matter one bit whether statistics or guesswork was used... in other words, the performance will probably be the same.

The query you show is a range query, and the index is effectively clustered, so you may have a hard time getting the optimizer to choose any other plan. The relationship between range queries and clustered indexes is a warm and cuddly relationship... they are almost inseparable, as in "selectivity = who cares, leave us alone, we're fine!" :)

AFAIK statistics help with choosing a plan... after that they play no role in executing the query (Memo to Engineering: Please Cancel or Allow that bold statement... I think it's true, based on this discussion)

permanent link

answered 22 Apr '13, 08:15

Breck%20Carter's gravatar image

Breck Carter
27.0k424581829
accept rate: 21%

edited 22 Apr '13, 08:29

Yes you are right, the example i provided is not one of our "slowest" queries. But it nicely shows imho that current date, ... is not handled as a constant. If i were to show a real example from our application, where i stumbled across this "phenomenon", i would have to create about 15 tables and 2-3 Megabytes worth of data. And i justed wanted to show that selectivity estimates in combination with current date, current timestamp are not using statistics. I also already wrote about a statement where a constant date showed a nearly 100% correct selectivity estimate of 0.01 % and there also exists an index on the appropriate column. There is a second predicate with a selectivity of 20% and because the engine seems to (imho correctly) favor indexes with lower selectivity that index is chosen because of the 25% selectivity estimate for everything current ... related.

The "real" table has about 260000 rows. So using the index on the timestamp leaves about 26 rows to process, the index which is used without specifying a user estimate leaves 52000 rows. So it has a severe impact in our application. Btw. user estimates ( Datum >= current date, 0.01) are getting ignored per default ( user_estimates = Override-Magic) if statistics are used. So i'm quite sure that statistics and current date, today() or current timestamp are not getting together at the moment but the optimizer seems to use a constant "magic" value of 25% as selectivity estimate. I don't say that this is an error, but in my (limited) view it seems more appropriate to use the quite good statistics SQL Anywhere maintains.

(23 Apr '13, 08:34) Markus Dütting
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×240
×27
×11

question asked: 19 Apr '13, 06:09

question was seen: 1,055 times

last updated: 23 Apr '13, 11:36