11.0.1.2467

I'm coding a web-based application where a lot of selects typically go like this=

select cols from work_table bt 
      inner join this_user_table user1 on user1.col1 = bt.col1
      inner join this_language_table l on l.col = bt.col2
      inner join this_workflow_table w on w.col = bt.col3
      left outer join this_user_table user2 on user2.col2 = bt.col4

I only need one column from each of the joined tables.

Most work_tables have typically less than 5000 rows

  • this_user_table: about 10.000 rows
  • this_language_table: about 7500 rows
  • this_workflow_table < 10 rows

The columns in the join conditions are all primary or foreign keys, so they're indexed.

From what I gather based on incomplete production data, the values in the join conditions tend to repeat themselves quite a bit. For instance, there may be 10.000 users, but only 100 or so satisfy the condition, and about a dozen users cover about 80% of all cases. The same applies to the language table.

I need to generate database views with the selects above for each of my work tables.

In terms of overall performance, what's better in your expert opinion?

Views with joins as above or inlined deterministic functions on the work table to retrieve the column I need from each of the other three tables?

asked 21 Aug '10, 13:31

Vincent%20Buck's gravatar image

Vincent Buck
70681520
accept rate: 15%

edited 21 Aug '10, 14:26

Calvin%20Allen's gravatar image

Calvin Allen
1.5k232638


SQL user-defined functions add considerable overhead to computation that each and every query at execution time must pay for. In addition, query optimizers (including SQL Anywhere's) are in the main unable to effectively optimize queries containing user-defined functions because it is exceedingly complicated to ensure equivalent semantics under all conditions with even basic rewritings. An exception with SQL Anywhere (at least) is function in-lining, but this can only be done if the UDF contains only a SELECT statement with no other procedural logic.

So I would caution against using UDFs in "ordinary" queries without doing some performance testing. In some cases I can envision the use of a UDF leading to performance gains because the overall statement then breaks down into separate components, making it easier for the query optimizer to find an efficient plan. But these situations would, in my view, occur rarely.

I have written about some of the tradeoffs of SQL rewritings along these lines in this article on my blog.

permanent link

answered 29 Aug '10, 13:41

Glenn%20Paulley's gravatar image

Glenn Paulley
10.7k568104
accept rate: 43%

Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×239
×21
×4

question asked: 21 Aug '10, 13:31

question was seen: 822 times

last updated: 29 Aug '10, 13:41