I have a query involving 4 tables. 3 have a foreign key relation. The optimizer estimates differ for all tables from the actual values.

          Estimate      Actual Rows returned 
Table A       2.8              1
Table B      4600           135000
Table C        1            135000
Table D     17700             255

The tables are joined from top to bottom. My desired plan would be the otherway round because I know, that only few rows in Table D will match.

A similar query for the same 4 tables results in:

           Estimate      Actual Rows returned 
Table D       490              290
Table C        3              1091
Table B        0.2             272
Table A       2.8              1

Both statements have the same number of search criteria one on Table A and one Table D. It only depends on the change of the value which is searched for in Table D. Difference in runtime between plan 1 and plan 2 is 3.5 to 0.01 seconds.

After some more tests it seems, that as soon as the estimate for matching rows in Table D falls below 1000 the better plan is used. A user estimate on the criteria for Table D is ignored by the optimizer, I wonder why? For the other criteria it is accepted.

What are your suggestions how to help the optimizer in such a situation?

asked 22 Jul '11, 11:40

Martin's gravatar image

Martin
8.6k117151237
accept rate: 14%

Have you tried to use the FORCE INDEX() clause? (Though overriding the optimizer is not generally recommended...)

Note: I don't claim it will help in this case...

(22 Jul '11, 12:15) Volker Barth

Normally the database server maintains the selectivity estimates for the data/columns in each table automatically but they can get out of sync under some scenarios. When this happens running CREATE STATISTICS for each table/column may help.

Example:

create statistics A ( colA );
create statistics B ( colB );
create statistics C ( colC );
create statistics D ( colD );

where colA, colB, colC, colD are the columns from the respective tables that you are wanting to improve the selectivity estimates.

Read more about CREATE STATISTICS in the docs.

permanent link

answered 22 Jul '11, 13:04

Mark%20Culp's gravatar image

Mark Culp
23.2k9132273
accept rate: 40%

Any recommendation how regular this should be done? Should it be part of a reqular maintainance?

(25 Jul '11, 03:03) Martin
Replies hidden
1

You shouldn't have to do it anytime, but some users do sequence(s) of operations that cause the statistics to get out of sync with reality. Most users never need to regenerate the statistics; at least one user has to do it every day (due to their particular insert/update/delete loads that cause the stats to be skewed). Improvements have also been made over the years to the code that keeps the histogram statistics.... so there isn't a "one answer fits all" answer. You need to make the determination if you need to recreate statistics, and if you do which columns stats need to be adjusted and what frequency it needs to be done. FWIW: You can use the dbhist tool to check the existing statistics for a column.

(25 Jul '11, 11:11) Mark Culp
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×27

question asked: 22 Jul '11, 11:40

question was seen: 722 times

last updated: 25 Jul '11, 11:11