Please be aware that the content in SAP SQL Anywhere Forum will be migrated to the SAP Community in June and this forum will be retired.

Two tables:
A (col1, col2, col3, primary key (col1, col2))
B (col1, col2, col3, col4, col5, primary key (col1, col2))

With table B having thousands of records, which of the following statements would be the least expensive?

Option 1:
insert into A (col1, col2, col3) on existing skip select col1, col2, col3 from B

Option 2:
insert into A (col1,col2,col3) select col1, col2, col3 from B where not exists (select * from A where col1=b.col1 and col2=b.col2)

My gut tells me they're exactly the same. But I need more than my gut to go on.

asked 13 Aug '14, 05:12

Liam's gravatar image

Liam
36191118
accept rate: 0%

2

No, I don't claim to know the answer. But I wanted to add, if you're on SQLA 11 or higher, there's a 3rd method (example not verified):

MERGE INTO A (col1, col2, col3)
USING B ON A.col1 = B.col1 and A.col2 = B.col2
WHEN NOT MATCHED INSERT


IMHO all tree statements should be broken down by the query engine to similar sequences of execution, so there should be no real difference in performance.

But that's speculative, so to get more detailed information you might want to execute both statements in identical environments, capture the graphical plans and analyze them for deviations.

(13 Aug '14, 08:43) Reimer Pods
1

OK just a guess but I would think Option 1 is more efficient when dealing with a small number of rows and option 2 when dealing with large inserts. I believe some DBMS's just ignore the error with option 1.

Jim

(13 Aug '14, 19:00) J Diaz
Replies hidden

I'm with Jim on this one. Thanks guys

(17 Aug '14, 00:48) Liam

I believe some DBMS's just ignore the error with option 1.

FWIW, I believe "INSERT ... ON EXISTING ..." is primarily a SQL Anywhere vendor extension so I don't really understand the "some DBMS's" statement... - of course, other DBMS like MySQL have somwhat similar extensions like "INSERT ... ON DUPLICATE KEY UPDATE..."

(19 Aug '14, 04:33) Volker Barth

  1. Predicting the performance of real-world SQL queries by visually inspecting stripped-down sample code has almost no chance of success.
  2. A three-column table with only "thousands of rows" may be regarded as a very small table by SQL Anywhere, not worthy of optimization. Exceptions exist, of course; e.g. this question about a 6,000 row table with an INTEGER primary key index that consumes 1.1G of disk space :)
  3. In this case (comparing ON EXISTING SKIP with WHERE NOT EXISTS) it's not difficult to actually code and test BOTH versions, in the real-world form, and look at the Graphical Plan With Statistics... testing is the only reliable way to determine performance...
  4. ...and even then, it's not perfect, because when conditions change sometimes so do the plans.

Having said that, my guess is they will have the same plans and the same performance. I am VERY PROUD of my guesswork... alt text

permanent link

answered 19 Aug '14, 06:22

Breck%20Carter's gravatar image

Breck Carter
32.5k5417261050
accept rate: 20%

Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×275

question asked: 13 Aug '14, 05:12

question was seen: 1,228 times

last updated: 19 Aug '14, 06:22