Is this expected behavior when an update on an HA primary database collides with an isolation level 3 query on the secondary read-only database? (yes, see Volker's answer)

If so, are there any recommendations for writing robust queries for use on a secondary read-only database? (yes, see Volker's answer)

Could not execute statement.
Rollback occurred due to deadlock during prefetch
SQLCODE=-684, ODBC 3 State="40001"
Line 1, column 1

SELECT EmployeeID, 
       Salary
  FROM Employees
 ORDER BY EmployeeID

alt text

Background:

Two 12.0.1.3298 servers are executing in an HA setup.

A query has been executed on the secondary server in read-only mode using isolation level 3:

SELECT EmployeeID, 
       Salary
  FROM Employees
 ORDER BY EmployeeID;

An UPDATE has been executed on the primary server:

UPDATE Employees
   SET Salary = Salary + 0.01;
COMMIT;

When the same query is re-executed on the secondary server, it seems to work UNTIL the updates are applied, then it fails with the deadlock:

SELECT EmployeeID, 
       Salary
  FROM Employees
 ORDER BY EmployeeID;

A TYPE DEADLOCK event exists on both databases, but it is not fired.

This command does not help when run on the secondary server: SET TEMPORARY OPTION PREFETCH = 'OFF';

Note also that an EXCEPTION handler will not be fired by the deadlock.

The behavior is the same in 16.0.0.1512

asked 25 Oct '13, 16:38

Breck%20Carter's gravatar image

Breck Carter
25.7k427587847
accept rate: 20%

edited 26 Oct '13, 08:56


According to the docs (yes, I do not have tested this!), it's expected behaviour that a query against a read-only mirror server will be roll backed when it interferes with applied transactions from the primary server - to cite:

Queries that are executed against the mirror database can place locks, depending on the isolation level specified. If locks interfere with operations being applied from the primary server, then the connections holding the locks have their transactions rolled back and any open cursors for those connections are closed. Applications running at isolation level 0 do not add row locks, but still acquire schema locks. If the schema locks interfere with operations being applied from the primary server, the transaction on the mirror database is rolled back.

So it seems that the internal connection applying the transactions from the primary has always a higher priority. That's the Watcom way of how it should be, methinks.

And I guess the same doc page also names the remedy - snapshot isolation:

Applications that require a consistent view of the database (and so cannot use isolation level 0) should consider using snapshot isolation. To do so, the allow_snapshot_isolation option must be set to On. This option takes effect on both the primary server and the mirror server, so the costs associated with snapshot isolation need to be considered.

permanent link

answered 26 Oct '13, 04:10

Volker%20Barth's gravatar image

Volker Barth
30.3k301454660
accept rate: 32%

edited 26 Oct '13, 04:12

Thanks, Volker! I actually remember reading parts of that section recently, but didn't associate it with this symptom... then when I went searching, I came up empty.

It is important to note that an EXCEPTION handler will NOT be fired, nor will a TYPE DEADLOCK event... the rollback is all-powerful.

FWIW re-testing in V16 yields identical results.

(26 Oct '13, 08:54) Breck Carter
1

You are correct Volker. The idea is that we don't want to block the recovery thread from doing its work on the mirror (or copy node) since, if we did allow the recovery to block, this could adversely affect the time it would take to failover in the case that the primary went down (and we want failover to be as fast as possible).

Breck: The exception handler does not fire for the same reason stated above - we don't want to delay the recovery thread in any way.

Having said the above, I/we are not sure why a DEADLOCK error is being thrown? (but I may have answered this later in this paragraph!) What really happens is that the recovery thread sends a CANCEL to the connection that is holding the lock (that the recover thread needs to get past) and a ROLLBACK is queued on that same connection so the locks are released. FWIW the same sequence occurs when a thread deadlock is detected so this is likely why the error is being reported as a DEADLOCK. We should likely attempt to change the server to throw a different error in the case that it is the recovery thread that is cancelling the connection's transaction?

In your particular case the prefetch operation was the 'unlucky' operation that was occurring when your connection got cancelled. Disabling prefetch will have no effect; the cancel will simply be noticed and reported by some other operation.

(26 Oct '13, 09:08) Mark Culp
Replies hidden
1

The screenshot actually shows what happens when SET TEMPORARY OPTION PREFETCH = 'OFF' is in effect... I'm guessing the description of SQLCODE -684 needs work :)

IMO the intentions are noble, but the effects on the read-only application are too extreme. It's one thing to ram through the recovery operations, it's another to suppress the TYPE DEADLOCK event... what possible motivation can there be to suppress an independent connection from running? It is just a diagnostic tool, and it cannot have any bad effect on the recovery process.

I'm also not convinced the EXCEPTION should be suppressed. Why can't it be fired after the deadlock has been processed and the offending transaction rolled back?

"Read-only" is not a synonym for "Unimportant".

(27 Oct '13, 09:27) Breck Carter
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×51

question asked: 25 Oct '13, 16:38

question was seen: 5,401 times

last updated: 27 Oct '13, 09:27