Statistics that are generated in the table syscolstat are really needed for the database? I'm having problems because of low performance statistics. I periodically drop statistics and improves performance significantly. Is there any possibility to not generate them? Thanks, Leonardo. asked 05 Oct '12, 09:26 LGregianin |
Yes, very much so. Collecting a statistical distribution of values helps provide the database optimizer a (rough) estimate of the amount of work it will need to perform to scan for sargable predicates in an index and also helps with join strategy estimates inside the optimizer.
You can use ALTER STATISTICS to control the collection of statistics on particular columns/tables. Generally it is not recommended to turn this feature off. Statistics are automatically maintained by the database engine as SELECT/INSERT/UPDATE/DELETE operations are executed. Generally, statistics can easily become 'incorrect' over time due to two major reasons:
For older versions of SQL Anywhere, the only way to correct this information is to manually issue the CREATE STATISTICS command when you know the information is incorrect. (Either slowly over time, or perhaps after large data operations). You can check what the current values are in the histogram data for your table by using the dbhist utility or the sa_get_histogram() stored procedure. The new statistics governor in SQL Anywhere 12 is intended to help "automate" the check for these two issues - when it detects an issue it will try to issue a "CREATE STATISTICS" statement for you, as needed. Since this feature did not exist in older versions, it may be required to review the statistical data as part of your migration strategy. answered 05 Oct '12, 16:23 Jeff Albion Jeff Albion, If the statistics generated are used just to improve the performance of the database, "because" this happens misconduct? Incorrect normalization of the database?
(09 Oct '12, 15:35)
Walmir Taques
Replies hidden
1
I'm not 100% sure about your question, but if you're implying that the above described behaviour is possibly a "bug", I would instead suggest this is more of a known product limitation that we're always trying to improve upon. As much as we try to mathematically keep track of accurate information, as I mentioned, over time this can become "skewed" (due to rounding issues of the actual percentage values themselves, plus the fact that we do not automatically re-divide up the histogram ranges without CREATE STATISTICS being run). Since we don't know that the statistical information is really different than what appears for all of the values stored in the table (until a manual SQL request comes in to re-evaluate the statistics), the database doesn't have a chance to correct itself. The newer versions of SQL Anywhere now try to limit this behaviour by doing this dynamic monitoring of the statistics for you rather than assuming that the user will be constantly monitoring the statistics themselves and adjusting as necessary manually (older SQL Anywhere versions). Aside: "Normalizing a database" generally implies something about the way entities are arranged in a schema (from a theoretical perspective) and is not related to database statistics (used for optimization at runtime).
(10 Oct '12, 17:23)
Jeff Albion
Hi, @Jeff Albion I do not mean to imply anything, just had a little doubt about "whether normalization of a database" could influence your performance with statistics. Now it's clearer. Sorry. Thank you!
(11 Oct '12, 13:51)
Walmir Taques
|
Leonardo, Please post the plan files (or sent them to me directly at anisoara.nica@sap.com): in dbisql: 1. call sa_flush_cache() before each execution 2. in the 'Plan Viewer', set the option 'Detailed and node statisitcs' 3. 'Get Plan' 4. save the plan in a file using 'Save As...' 5. repeat 1-4 for each combination {without drop stats, drop stats} X {with order by, without order by} Thanks Ani answered 09 Oct '12, 13:54 Nica _SAP |
These are the results: Before drop statistics without order by clause Before drop statistics with order by clause After drop statistics without order by clause After drop statistics with order by clause In my previous tests I was not running the command: answered 10 Oct '12, 13:42 LGregianin Can you please email or attach the query plan as per the instructions. The screen caps and PDFs do not contain all of the information nor provides an easy mechanism to analyze the issue. To get the plan that we require,
Please ensure that sa_flush_cache() is called for each iteration as noted in Ani's previous post.
(10 Oct '12, 14:25)
Chris Keating
|
I too would like to know if this is possible. I'm seeing the same issue after upgrading from version 9 to 12. If I drop statistics from the PK on a 37 million row table, performance goes through the roof. Response comes back in 2 seconds instead of 60+.
Leonardo and Grant,
Would you please post the graphical plans with statistics before ( I.e.,the slow plan), and after (i.e., the fast plan) you run 'drop statistics' statement. It would be useful to investigate these particular issues based on your plans.
With many thanks, A. Nica
Hi Nica,
Sybase 9.0.2 without drop statistics
Sybase 9.0.2 with droped statistics
Test in Sybase 12.0.1
UPDATE:
Select:
More details:
Strange both plans look identical, are your charts correct?
Yes @Martin, the graphics are correct. The only difference between the two queries is drop statistics [all my tables] statement.