Hello, We would like to know what default order Sybase uses when distinct appears in a select query. Here is the use case (a simple table with int, decimal, string and date columns):
create table ##tab1 (intcol int, decimalcol decimal(8,4), strcol varchar(100), datecol date)
go
Query 1: select intcol, datecol from ##tab1
Query 2: select distinct intcol , strcol , datecol, decimalcol from ##tab1
Query 1 sorts the results first by intcol and then by datecol, as given in the select clause.
However, Query 2 (which has strcol in the select) orders the results in the reverse fashion: first by decimalcol, then datecol, then strcol, and finally intcol. We see this only when a string column is used in the select.
Can someone please comment on whether there is a pattern Sybase uses to order results when distinct is used without an explicit "order by" clause?
Thank you. Naresh
The results should be free to change order unless explicitly directed otherwise.
As Volker mentioned, the order may depend on what is optimal for that database server. Because of this, different plans (index/table scan, nested join/hash join, etc.) will cause evaluation of the query in different orders and can easily change the order of the output.
If you want an ordering you can rely on, you should specify this explicitly.
answered 11 Oct '11, 12:32
Why would you like to rely on that?
AFAIK, the query optimizer is totally free to return the result set in whatever "order" it considers optimal for performance reasons (including in no obvious order at all) - unless you specify an explicit ORDER BY clause.
Without an ORDER BY the same query may return the result set in different "orders" for different runs - as the query optimizer is free to choose different plans for each run.
Therefore I guess your observations might vary when used with different machines, different table contents and the like...
BTW: Are you using SQL Anywhere or Sybase ASE? - My comment (and the whole forum) is focussed on SQL Anywhere...
answered 11 Oct '11, 12:24
The underlying assumption in the original question is that sorting is used for duplicate elimination.
While a SQL Anywhere server does support sort-based duplicate elimination, more often the optimizer will pick the (more efficient) hash-based duplicate-elimination strategy, which inherently does not generate a duplicate-free result set in a reliable order.
To guarantee the ordering of tuples in a result set, one MUST use an ORDER BY clause.
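As a minimal sketch of that advice, here is Query 2 from the question with an explicit ORDER BY added. The column names follow Query 2 (strcol, decimalcol), and the choice of sort keys is only an assumption for illustration; pick whichever columns define the order you actually need.

```sql
-- Query 2 from the question, with an explicit ordering.
-- The sort keys are an illustrative assumption, not something the thread prescribes.
select distinct intcol, strcol, datecol, decimalcol
from ##tab1
order by intcol, strcol, datecol, decimalcol
```

With the ORDER BY in place, the row order no longer depends on whether the optimizer picks sort-based or hash-based duplicate elimination.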
answered 12 Oct '11, 17:15