I've finally migrated a 8.0.3 consolidated database to 12.0.1.3324, and nearly all is well.

The one strange thing happens with two 12.0.1.3324 remotes that I use to test with.

They have twice failed to replicate with the above error, such as:

I. COMMIT  
W. Deleting duplicate message from "cons" (13-01334747425-01334747425-0)  
W. Deleting duplicate message from "cons" (13-01335463391-01335463391-0)  
E. Messages (13-01334747425-01334747425-0) and (13-01334747425-01334811163-0)  
   from "cons" contain duplicate log operations  
I. execute "dbo".sp\_hook\_dbremote\_receive_end

Then the cons answers with a resend, and the resent messages (comprising several files) are then applied without problem. This happens with both v12 remotes.

Strange enough, I have "identical" remotes with 8.0.3 (i.e. with logically identical subscriptions) that do apply the same messages without error (and without need to resend).

AFAIK, the SQL Remote options are similar (e.g. -l is set to the same value - I'm using the DBTools API actually).

I've been using the v12 remotes for a while (since last November), and that kind of error has appeared now and then (and could obviously be resolved by a re-send in all cases). Therefore it does not seem to be bound to the cons having moved from v8 to v12.

Therefore I'd like to ask if there is a known problem with v12 and such errors...


Just to add:

Further analysis shows that this has not happened with 12.0.1.3152 and older versions, and it seems to happen when the remote gets more than one message with the same message header. In the samples, there were several "empty" messages with the following header:

I. Nachricht von "cons" (13-01334747425-01334747425-0) erhalten  
I. Nachricht von "cons" (13-01334747425-01334747425-0) erhalten

This seems to be seen as an error with 12.0.1.3298 (and 12.0.1.3324) whereas older versions do apply messages with identical headers without problem, here from the log of the 8.0.3 remote:

I. Nachricht von "cons" (1-01334747425-01334747425-0) erhalten  
I. Nachricht von "cons" (1-01334747425-01334747425-0) erhalten  
...  
I. Nachricht von "cons" (1-01334747425-01334747425-0) wird angewendet  
I. Nachricht von "cons" (1-01334747425-01334747425-0) wird angewendet

Note 1: Sending messages with identical headers is very common in our setup, as we use SQL Remote in batch mode, and therefore each remote that has not "answered" currently will soon get a "liveness" message for each SQL Remote run in the cons.

Note 2: A remote with 10.0.1.4181 seems to have the same behaviour.

asked 27 Apr '11, 13:40

Volker%20Barth's gravatar image

Volker Barth
30.0k294448654
accept rate: 32%

edited 28 Apr '11, 11:31

Just to note: I seem to be able to reproduce the described behaviour when using the DBRemoteSQL() API (as the remotes do usually). On a test system with 12.0.1.3324 and usage of DBREMOTE.EXE, those messages with identical header are applied correctly. - Has there been a change to the API lately (I have linked against 12.0.1.3152 (MR))?

When comparing the DBREMOTE options and the API flags, they are mostly identical - a few exceptions being the API set to -w 1 and -g 10 - (instead of the default -w 0 and -g 20) but that has been in use since years.

(28 Apr '11, 11:56) Volker Barth

Volker, can you post a dbremote log showing a complete run where this error is reported?

This might be tough to debug, since the error will occur saying that there are duplicate messages, and that results in wanting to see the actual message files, that SQL Remote will just have applied and deleted.

(28 Apr '11, 14:46) Reg Domaratzki
Replies hidden

I'm gonna send some samples directly to you. That will also contain copies of message files that will show that behaviour.

(29 Apr '11, 05:36) Volker Barth

The problem seems to be bound to the setting of the -w option. I have run the same messages against the same v12.0.1 remote with DBREMOTE.EXE and just switched from -w 0 to -w 1, and the problem appears.

(29 Apr '11, 07:26) Volker Barth

I'm not going to have any cycle in the next few days to look at an in-depth problem like this, as I'm currently working an escalated issue. If you need a guaranteed response time, you'll need to open a technical support case, or even better, you could post the information you sent to me directly on the forum, in case somebody else would like to comment on it.

(29 Apr '11, 10:35) Reg Domaratzki
Replies hidden

Reg, it's no immediate problem, and the logs do contain data that I do not like to share in public.

Therefore I'm just giving some more info (based on the mail to you) in a separate answer - just in case:)

(29 Apr '11, 10:41) Volker Barth
showing 3 of 6 show all flat view

The problem of dbremote reporting "Deleting duplicate message" has been fixed in the following builds :

v11.0.1.2615, v12.0.0.2691, v12.0.1.3365

We strongly suspect but cannot prove that the "duplicate log operations" is the same problem, but have not been able to reproduce that problem here.

permanent link

answered 02 Jun '11, 08:49

Reg%20Domaratzki's gravatar image

Reg Domaratzki
5.1k33573
accept rate: 40%

Should also have mentioned that the CR number for the fix is 672208.

(02 Jun '11, 09:34) Reg Domaratzki
Replies hidden

And here's the link. - Thanks Reg, I'm gonna report once an EBF is available.

BTW: Is there any chance to fix this in 10.0.1, too, as the underlying change (CR 655157) has been introduced in 10.0.1.4171, too? I'm aware that 10.0.1 is in limited support now, but ...

(02 Jun '11, 18:22) Volker Barth

@Reg: Are there any plans to fix this in 10.0.1, too? I really would like this code line to get fixed, too.

Otherwise, at least I would know whether I will need to change my DBTools structure settings for our 10.0.1 clients.

(17 Jun '11, 17:14) Volker Barth

@Reg: Just to confirm that EBF 12.0.1.3389 really has solved this issue.

From the quietness around this answer, I'm going to assume that a fix for 10.0.1 is not planned.

(13 Jul '11, 17:06) Volker Barth
Replies hidden

Just to add: The new 12.0.1 EBF also prevents the "duplicate log operations" message whereas a similar remote running with the unfixed 10.0.1.4181 still does yield this error message.

(13 Jul '11, 17:14) Volker Barth

Further analysis shows that this might be bound to the number of worker threads for applying messages:

It occurs that when -w is set to 0 (the default), the messages with identical header are applied correctly (and each message is applied directly after receiving). In contrast, using -w 1 (as our DBTools-API is set to) receives all messages and then starts to apply them asynchronously (which is desired and the obvious effect of using a separate worker thread for applying).

This has worked fine 12.0.0.2566 - 12.0.0.2601 and with 12.0.1.3152 (MR) - and has worked for years with v8.

However, the parallel application seems to lead to problems with 12.0.1.3298 and above, as the applying thread seems to treat messages with identical headers as wrong and then triggers the warning about duplicate messages.

10.0.1.4181 seems to have this issue, too.

When using no separate worker thread (-w 0), each message is received and applied on its own (as long as the order is correct). As the applying code does only treat one message at a time it cannot notice that there are duplicate identical message headers, and therefore does not trigger a warning. As such, using -w 0 seems a valid workaround.

So I would conclude that something has changed in the analysis of message headers in recent builds that causes this issue.

The issue should be reproducible with any SQL Remote setup fo the according builds, if a cons sends several messages to the same remote (during several batch runs) when the log offset in the cons has not changed in-between. That will send messages with identical headers (i.e. the unchanged log offset) to the remote.

permanent link

answered 29 Apr '11, 10:48

Volker%20Barth's gravatar image

Volker Barth
30.0k294448654
accept rate: 32%

edited 29 Apr '11, 10:53

@Reg: I do not expect many other members here to be able to answer my question (as I guess they would need to have the SQL Remote code available...) - this is just in case your current workload gives you a chance to dive into this - let me know:)

(07 May '11, 16:22) Volker Barth

Volker, I've been able to reproduce the warnings, and as you point out, it only occurs when the -w switch is used, and in my limited testing it's inconsistent on when the warning is displayed and when the message is applied (timestamps removed in the output below so it all fits nicely on one line).

I. Applying message from "rem1" (0-0000564123-0000564123-0) W. Deleting duplicate message from "rem1" (0-0000564123-0000564123-0) W. Deleting duplicate message from "rem1" (0-0000564123-0000564123-0) W. Deleting duplicate message from "rem1" (0-0000564123-0000564123-0) W. Deleting duplicate message from "rem1" (0-0000564123-0000564123-0) I. Applying message from "rem1" (0-0000564123-0000564123-0) I. Applying message from "rem1" (0-0000564123-0000564123-0) I. Applying message from "rem1" (0-0000564123-0000564123-0) I. Applying message from "rem1" (0-0000564123-0000564123-0)

I still haven't seen the error message from your first post, but I'll look into why the warnings are being printed and see if I can follow a code path that gets me to the error.

I strongly suspect that this behaviour is an unwanted side effect to the change that we put in versions 10.0.1.4171, 11.0.1.2547, 12.0.0.2634 and 12.0.1.3276 :

Originally, if there are any missing messages, SQL Remote would ask for a resend after it has reached its receive polls given by the -rp option. This resend logic could cause the publisher to re-scan the transaction log(s) and slows down replicating new transactions, especially on heavy load databases.

Now, when a message or messages in a multi-part message series (SQL Remote will generate multiple messages for a single transaction to form a multi-part message when the transaction is too big to fit in a single message) are missing, SQL Remote will not immediately ask for a resend, if the received messages are not followed by any messages that contain a commit or any messages that belong to another multi-part message series. This new logic will help users who need to shut down or kill SQL Remote when it is sending multi-part messages to its subscribers.

(11 May '11, 16:33) Reg Domaratzki
Replies hidden

FWIW, the CR number Reg refers to is 655157.

(12 May '11, 04:13) Volker Barth

Anything new to this topic?

Should I plan to build a workaround (by adapting DBTools to use the "-w 0" setting), or is a fix expected to come?

(29 May '11, 17:16) Volker Barth

This problem (CR 672208) should have been fixed in the following versions, 11.0.1 (EBF 2615), 12.0.0 (EBF 2691), 12.0.1 (EBF 3365). Please ask for a newer EBF.

(14 Jul '11, 10:08) Yufei Guo
Replies hidden

@Yufui: I'm aware of that - Reg has posted the CR number and more in his answer and comments below.

And there I do have asked for an additional fix in 10.0.1 - but surely I can ask again here:)

(14 Jul '11, 10:16) Volker Barth
showing 3 of 6 show all flat view
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×409
×76

question asked: 27 Apr '11, 13:40

question was seen: 1,462 times

last updated: 14 Jul '11, 10:16