How to swap two unique values in a table? (SQL Anywhere 16) Thanks for any comments. Problem:
|
It is a bug in INSERT ... ON EXISTING UPDATE. I have filed a bug report (engineering case 761289). As @Volker suggests, the MERGE statement works well as a workaround (shown below). When an UPDATE statement causes a uniqueness violation error during processing it might be a problem resolved by later rows. The offending row is removed from the base table and moved to a "HOLD TABLE". After all rows are processed by the UPDATE, the rows in the HOLD TABLE are inserted into the base table and constraints are re-checked to see if the UPDATE resolved the problem after completing. See the help topics for Locks during updates and Data changes using UPDATE for more details. When an INSERT ... ON EXISTING UPDATE statement encounters a uniqueness error, the row is also moved to the HOLD TABLE but there is no subsequent insert from the HOLD TABLE into the base table, leading to an unexpected delete. A workaround is to use the MERGE statement, for example:
HOLD TABLE processing of MERGE statements is not affected by this issue. In version 8.0 through 9.0, the uniqueness violation during the INSERT...ON EXISTING statement would generate an error instead of deleting the row. In versions 10.0 and above the row is moved to the HOLD TABLE but not re-added to the base table, leading to the missing row. The MERGE statement is generally preferred to the INSERT..ON EXISTING as it clearer what the operation should do (e.g., which index is used) and offers more flexibility in defining the actions. I do admit I had to review the syntax in the documentation for MERGE while the INSERT..ON EXISTING is stuck in my mind, but I do prefer the MERGE. 1
Wow, that's good to know - I'll feel less embarrassed then each time I have to look that up:)
(02 Apr '14, 12:10)
Volker Barth
Is that the reason why a "unique value swap" is allowed by the server but will lead to problems for SQL Remote as the single UPDATE swap statement will be replicated as several one-row UPDATE operations that will then lead to UQ violations? (OK, that would be a better question on its own...). I'm thinking of an UPDATE (say, colUnique would be a candidate key) like
In my understanding, that will be replicated as
(02 Apr '14, 12:16)
Volker Barth
Comment Text Removed
1
Thanks for that answer, the explanation and for filing the bug report.
(03 Apr '14, 02:37)
MartinM
Comment Text Removed
1
> The MERGE statement is generally preferred to the INSERT..ON EXISTING as it clearer what the operation should do (e.g., which index is used) and offers more flexibility in defining the actions If you need flexibility (say, doing complex ETL stuff) then MERGE is a candidate. However, flexibility does NOT necessarily imply "generally preferred". In the general case (say, 80% of the time) folks need a simple UPSERT operation, and they are generally NOT confronted with rare challenges like swapping unique column values. Generally speaking, simplicity is preferred over complexity, and MERGE is nothing if not complex. The only reason anyone has to look up UPSERT is because of the unfortunate syntax (INSERT ON EXISTING UPDATE)... having the benefit of time, HANA got it right. Aside: The statement "HANA got it right" is open to debate. The examples for UPSERT in the HANA docs (currently found here) make UPSERT look good (e.g., UPSERT T VALUES (1, 1)) but the documentation is both unclear and contradictory. FWIW HANA doesn't have MERGE; it does have MERGE DELTA but that's a Monty Python feature ( something completely different :)
(03 Apr '14, 08:46)
Breck Carter
2
I should say that MERGE is generally preferred by me. I do enjoy when the server takes care of tedious typing for me and does "what I want" but with INSERT ON EXISTING I feel it takes too many liberties and does things that are not what I expected. Further, the INSERT is really doing a hidden join between the input table and target table, and that join can only be executed by effectively an index-nested-loops algorithm as each input row is considered for insertion. The MERGE statement has an obvious join (but doesn't require you do a lot of typing if you just want primary key matching). If the statement is mostly going to update existing rows , then the MERGE can be substantially faster by using another join algorithm. It might be interesting that the INSERT ... ON EXISTING syntax was added to expose an underlying "upsert" feature used only for SQL Remote. That underlying feature is still used and it has a few extra bells and whistles (such as detecting RECORDING_CONFLICT_DETECTED). The INSERT .. ON EXISTING is a good feature but I prefer the way the MERGE statement gives me the power without either requiring too much extra typing or guessing too much at my intention. By keeping the simple case in my commonly used tools, I'm ready when I do get to a trickier situation (what happens if there is a default or computed column that is part of the primary key? What happens when I don't want to join on the primary key?). I prefer to use MERGE even in simple cases but your preferences my differ.
(03 Apr '14, 09:57)
Ivan T. Bowman
One might add that MERGE is SQL standard while INSERT...ON EXISTING is not, however, in my (certainly not too deep!) understanding it's only the SQL Anywhere specific additions and defaults that cause the really complex MERGE statement to "look easy" and thereby make it useful for simpler cases. Therefore the "it's a standard!" claim should not be overrated:)
(03 Apr '14, 10:29)
Volker Barth
More comments hidden
|
Very interesting...
Using the MERGE statement might be helpful here - instead of simply relying on the PK to join the source and target table, you can explicitly specify the condition. That way you might separate the handling of rows with changed unique values from "ordinary updates" - and you can include the delete action in the same (atomic) statement.