Select Distinct vs Group By

In general, will you see faster performance selecting distinct or selecting and then grouping by each column to achieve the same results?

Probably depends on the data I know, but just looking for general pointers and factors to consider one way or the other.

performance select-distinct group-by

asked 16 Jun '11, 14:25

Siger Matt
3.3k●56●72●101
accept rate: 15%

Can you give a SQL sample of the two things you want to compare?

(16 Jun '11, 15:41) PhilippeBert...

Comment Text Removed

My money's on the plans and costs being virtually identical.

(17 Jun '11, 12:57) Breck Carter

Replies hidden

I don't know, but if Breck is right, then I would recommend SELECT DISTINCT as long as you only want to eliminate duplicate rows.

In my book, GROUP BY is typically used with aggregates like COUNT() or SUM(), and that seems a different usage. Therefore I would suggest to separate both requirements.

(17 Jun '11, 16:58) Volker Barth

I agree with you and Glenn... since the performance is the same, eschew obfuscation!

(18 Jun '11, 12:57) Breck Carter

flat view

One Answer:

active answers oldest answers newest answers popular answers

I would not rewrite SELECT DISTINCT as a grouped query, for two main reasons.

In releases of SQL Anywhere older than Version 8, distinct and grouped queries had precisely the same implementation and their execution characteristics were identical. Starting with Version 8, the server constructs different DFO operators for the operations, however much of the underlying implementation remains the same (using shared code) and hence I would be surprised if there were substantial performance differences between the two.

The reasons I would refrain from rewriting SELECT DISTINCT as a grouped query are:

From an application programming standpoint, SELECT DISTINCT is more straightforward and reflects the intent of the query. A grouped version may cause subsequent confusion at a later time.
Perhaps more importantly, there may be differences in how complex queries involving grouping are rewritten during semantic rewrite optimization. Rewrite optimization is based largely on the discovery of patterns, and it would not be surprising to find that a complex query using SELECT DISTINCT is rewritten differently than its equivalent that uses GROUP BY.

The latter item can be substantially important with respect to how complex queries are simplified by the query optimizer, and for that reason I would retain the syntax of the intended semantics (of SELECT DISTINCT) rather than try to do anything fancier.

permanent link

answered 18 Jun '11, 11:13

Glenn Paulley
10.8k●5●77●106
accept rate: 43%

flat view

Your answer

toggle preview

community wiki:

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

*italic* or _italic_
**bold** or __bold__
link:[text](http://url.com/ "title")
image?![alt text](/path/img.jpg "title")
numbered list: 1. Foo 2. Bar
to add a line break simply add two spaces to where you would like the new line to be.
basic HTML tags are also supported

learn more about Markdown

Question tags:

performance ×276
group-by ×11
select-distinct ×6

question asked: 16 Jun '11, 14:25

question was seen: 13,814 times

last updated: 18 Jun '11, 12:57

SAP SQL Anywhere

Select Distinct vs Group By

Follow this question

Related questions