In general, will you see faster performance selecting distinct or selecting and then grouping by each column to achieve the same results? Probably depends on the data I know, but just looking for general pointers and factors to consider one way or the other. asked 16 Jun '11, 14:25 Siger Matt |
I would not rewrite SELECT DISTINCT as a grouped query, for two main reasons. In releases of SQL Anywhere older than Version 8, distinct and grouped queries had precisely the same implementation and their execution characteristics were identical. Starting with Version 8, the server constructs different DFO operators for the operations, however much of the underlying implementation remains the same (using shared code) and hence I would be surprised if there were substantial performance differences between the two. The reasons I would refrain from rewriting SELECT DISTINCT as a grouped query are:
The latter item can be substantially important with respect to how complex queries are simplified by the query optimizer, and for that reason I would retain the syntax of the intended semantics (of SELECT DISTINCT) rather than try to do anything fancier. answered 18 Jun '11, 11:13 Glenn Paulley |
Can you give a SQL sample of the two things you want to compare?
My money's on the plans and costs being virtually identical.
I don't know, but if Breck is right, then I would recommend SELECT DISTINCT as long as you only want to eliminate duplicate rows.
In my book, GROUP BY is typically used with aggregates like COUNT() or SUM(), and that seems a different usage. Therefore I would suggest to separate both requirements.
I agree with you and Glenn... since the performance is the same, eschew obfuscation!