Say, I have a child table (1:n) and want to find out which (if any) parent has entries with different values in the child table.
Sounds unclear, methinks, so I try to be more concrete:
For example, in the SA 12 sample database, I want to find out whether any Customer has entries in the SalesOrders table with different regions. So, obviously, a GROUP BY is needed, and CustomerID is the parent id.
But how do I "count" the different region entries? - Since there are different entries per CustomerID, a simple count(*) won't do.
Classically, I have used min/max-comparisons for that, e.g.
But a count(distinct) query might seem more adequate, such as
However, the latter seems (a bit) worse in performance, possibly as it requires a duplicate elimination. (I respect that it does offer more information, as it really counts the different entries whereas the first query just returns those with at least 2 different entries.)
Is there any preferred version for queries like that (both for reasons of comprehensibility and performance)?
It's hard (impossible?) to tell for sure with such a small table, but my first guess that query 2 is not only simpler but faster seems to be born out by the Version 220.127.116.1166 runtimes and the plans after calling sa_flush_cache().
Here's the query I might have written; its plan is more complex than query 2 but the runtime looks the same (BUT AGAIN, the table is too small to tell anything)...
CALL sa_flush_cache(); SELECT CustomerID, COUNT(*) AS RegionCount FROM ( SELECT DISTINCT CustomerID, Region FROM SalesOrders ) AS CustomerRegion GROUP BY CustomerID HAVING RegionCount > 1 ORDER BY 1;
answered 14 Oct '10, 14:53
This is a nasty hack I've seen used before, I can't remember using it though.
answered 14 Oct '10, 14:36