Running SQLAnywhere 11.01, I am executing a relatively simple select statement against a narrow table with approximately 500 million rows, inner joined to some smaller tables (one with 3.5 million rows and the others with only about 2000 rows each,) using the iSQL client. The first query returns in less than 5 seconds. Subsequent queries that are similar, but for which I have changed one of the parameters, take approximately 40 seconds.
The parameter that I am changing is in one of the secondary tables, but the foreign key is part of the primary key as well as a clustered index on the primary table.
I have tried changing various connection parameters (optimization_goal, optimization_level, max_client_statements_cached and max_plans_cached). All of these are currently at the default. Changing them did not change the discrepancy between performance of my first statement execution and subsequent executions.
I wonder if there is an option I can set that might eliminate this difference in performance in favor of the speed I see on the first query run?
I'm happy to provide more information if that would be helpful.
asked 19 Jul '11, 15:40
Thank you for the suggestion Volker.
The graphical plans between the two different executions don't look notably different. Overall statistics are the same within fractions of seconds, as is the general plan.
I changed the query to select based on the foreign_key value itself (happens to be a foreign key to a date dimension), which is part of the primary key and clustered index, rather than on the value from the secondary table. In addition to improving performance, the differences in the first and subsequent execution times disappeared. So it seems to have something to do with using a secondary table value in my selection criteria. The good news is that we're in the proof-of-concept phase so its no problem for me to adjust the schema accordingly.
answered 19 Jul '11, 18:44
Check in your plans what is written next to the keyword "Selectivity" and also what is the difference between "estimated" "RowsReturned" and "Actual (Total)". I guess you will see a diskrepancy like estimated=1, actual=a huge number.
Such a diskrepancy is something which I have experienced, then the optimizer seems to just take into account the statistics of finding one foreign key in the primary table, which results in the join of secondary (or in your case probably the 3rd table) with primary first before the search criteria is applied to the join result. In contrast the optimal strategy would have been to apply first the search criteria to limit the secondary table matches and then do the join.
You might be able to influcence the behavior with a user estimate where ("criteria" , 0.00001) or something similar.
answered 20 Jul '11, 07:34