I'm coding a web-based application where a lot of
I only need one column from each of the joined tables.
The columns in the join conditions are all primary or foreign keys, so they're indexed.
From what I gather based on incomplete production data, the values in the join conditions tend to repeat themselves quite a bit. For instance, there may be 10.000 users, but only 100 or so satisfy the condition, and about a dozen users cover about 80% of all cases. The same applies to the language table.
I need to generate database views with the
In terms of overall performance, what's better in your expert opinion?
Views with joins as above or inlined deterministic functions on the work table to retrieve the column I need from each of the other three tables?
SQL user-defined functions add considerable overhead to computation that each and every query at execution time must pay for. In addition, query optimizers (including SQL Anywhere's) are in the main unable to effectively optimize queries containing user-defined functions because it is exceedingly complicated to ensure equivalent semantics under all conditions with even basic rewritings. An exception with SQL Anywhere (at least) is function in-lining, but this can only be done if the UDF contains only a
So I would caution against using UDFs in "ordinary" queries without doing some performance testing. In some cases I can envision the use of a UDF leading to performance gains because the overall statement then breaks down into separate components, making it easier for the query optimizer to find an efficient plan. But these situations would, in my view, occur rarely.
I have written about some of the tradeoffs of SQL rewritings along these lines in this article on my blog.
answered 29 Aug '10, 13:41