Hi, I am using sql remote for replicating between two distanced databases, where for some reason the replication is not succeeding anymore, and the Consolidated DB argues <missing message="" from="" remote="" db="">. Question: is there any way to enforce the Remote DB to jump back in transaction log files, and send older messages again to the Consolidated? So far as I know, if I could change the field 'log_send' in table SYSREMOTEUSERS in Remote then I will have what I want.

Could you please tell me whethere there is any procedure that could edit the table SYSREMOTEUSERS?

Thanks in advance

asked 08 May '17, 06:51

Baron's gravatar image

Baron
2.1k134146174
accept rate: 46%

edited 08 May '17, 06:55

3

First a few observations:

  1. Missing message conditions should be self correcting after a few cycles. This is part of the SQL Remote guaranteed delivery mechanisms.
  2. You may not be able to resend from an earlier offset if you don't have the offline logs still available.
  3. Modifications to SYSREMOTEUSER is not a user available feature. You can work with product support to figure out what your best course of action should be {which may not be this approach}.

What does dbremote report on the other end? Missing message on one side does not identify a cause for this or provide any details about the state of the remote; and you may need to take some action from the other side instead.

Do you know what activity lead up to this? Backups? Deletion of offline logs? Restore from backups? Some sort of system failure?

(09 May '17, 13:16) Nick Elson S...

Thanks for the reply, I tried several several cycles, but the consolidated insists that there are missing messages (from offset x to offset y) from Remote. The output file of executing dbremote.exe on remote side shows that it is sending to Consolidated messages (with the same offset range which the consolidate is looking for).

In other words, the consolidated waits for a range of offset, the remote send exactly this range but the consolidated refuses to take them.

(09 May '17, 13:24) Baron
Replies hidden

So show us the contents of sysremoteuser for both sides...

(09 May '17, 15:39) Volker Barth

Remote:*

log_send: 20869149 log_sent: 20869149 confirm_sent: 17407713 resend_count: 289

Consolidated:*

log_received: 17407713 confirm_received: 17407713 rereceive_count: 290

Error on consolidated:*

Missing message from user (289-0017407713-0020858324-0)

PS1: the output file of dbremote on remote tells that it has sent already messages within this range of offsets to consolidated (exact text is no more available).

PS2: none of above mentioned activities has caused this failure (restore from backup, deleting log files, system failure) but most probably some messages could be lost on the way travelling from Remote to Consolidated.

(10 May '17, 02:40) Baron
Replies hidden

Well, according to the offsets, both parties agree that the offset 17407713 has been applied and confirmed by the cons.

What does the opposite values tell (_received values on the remote, _send/*_sent values on the cons)? Does the remote have newer contents to send to the cons? (Otherwise, it may take a while/some SR cycles to sent new messages...

What is the exact -v output from DBREMOTE on the remote?

(10 May '17, 03:30) Volker Barth

Remote:*

log_send: 20,869,149 log_sent: 20,869,149 confirm_sent: 17,407,713 resend_count: 289 log_received: 93,281,255,108 confirm_received: 93,281,255,108

Consolidated:*

log_send: 93,281,265,392 log_sent: 93,281,265,392 confirm_sent: 93,001,593,042 log_received: 17,407,713 confirm_received: 17,407,713 rereceive_count: 290

(10 May '17, 03:58) Baron
Replies hidden

One important questions is: How does the message agent decide which range of offset to send to the subscriber? According to trivial logic I can figure out that it should send the range (confirm_sent -->> offset of last commit in transaction log files). What will mean then the values of log_sent, log_send? Is it only for calculating the resend count and avoiding double replication?

Could you show me a good documentation rather than the standard documentation of SYBASE?

Thanks

(10 May '17, 04:31) Baron
Replies hidden

So while both parties agree on the offset sent by the remote, the disagree about the confirmed log offset sent by the cons (93,281,255,108 as to the remote, only 93,001,593,042 as to the cons), so the cons has not yet received the confirmation by the remote.

Still waiting for a DBREMOTE -v output...

(10 May '17, 04:58) Volker Barth

I usually consider SQL Anywhere documentation "good" :)

FWIW, to learn about offset handling, I'd highly recommend to set up a small test environment with 2-3 remotes and study the contents of SYSREMOTEUSER (or the DBREMOTE -v output) between several runs. And just simulate a "missing" message by deleting it before it is applied by the according site. (Studying the current log contents via DBTRAN -sr ... helps, as well).

Hard/boring stuff, probably, but then again, it's basically a simple system...

(10 May '17, 05:03) Volker Barth

sorry for the invonvinience, but the -v output file is at the moment not available (and it is in German), I will try to copy it from a remote computer later. Actually I solved the problem in a manual way, by carrying the changes manually and build a whole replication from begining.

Thanks for the support anyway, and will append the file as soon as available.

(10 May '17, 05:14) Baron
Received message from "DBSRV2_user" (1-093281255108-093281255108-0);
Applying message from "DBSRV2_user" (1-093281255108-093281255108-0);
Sending message to "DBSRV2_user" (289-0017407713-0017573658-0);
Sending message to "DBSRV2_user" (289-0017573658-0017833440-0);
Sending message to "DBSRV2_user" (289-0017833440-0018691971-0); 
Sending message to "DBSRV2_user" (289-0018691971-0018888948-0); 
Sending message to "DBSRV2_user" (289-0018888948-0019100640-0); 
Sending message to "DBSRV2_user" (289-0019100640-0020858324-0); 
Sending message to "DBSRV2_user" (289-0020858324-0020869149-0); 
###*******########
Received message from "DBSRV2_user" (1-093196040590-093196049435-0);
Applying message from "DBSRV2_user" (1-093196040590-093196049435-0);
Sending message to "DBSRV2_user" (210-0017407713-0017573658-0);
Sending message to "DBSRV2_user" (210-0017573658-0017833440-0);
Sending message to "DBSRV2_user" (210-0017833440-0018691971-0);
Sending message to "DBSRV2_user" (210-0018691971-0018888948-0);
Sending message to "DBSRV2_user" (210-0018888948-0019100640-0);
Sending message to "DBSRV2_user" (210-0019100640-0019193705-0);
###*******########
Received message from "DBSRV2_user" (1-093095582738-093095582738-0);
Applying message from "DBSRV2_user" (1-093095582738-093095582738-0);
Sending message to "DBSRV2_user" (61-0017407713-0017573658-0);
Sending message to "DBSRV2_user" (61-0017573658-0017833440-0);
Sending message to "DBSRV2_user" (61-0017833440-0017933664-0);
###*******########
Received message from "DBSRV2_user" (1-093095568557-093095575285-0);
Applying message from "DBSRV2_user" (1-093095568557-093095575285-0);
Sending message to "DBSRV2_user" (59-0017933664-0017933664-0);
###*******########
Received message from "DBSRV2_user" (1-093001593042-093001640470-0);
Applying message from "DBSRV2_user" (1-093001593042-093001640470-0);
Sending message to "DBSRV2_user" (0-0017407713-0017458277-0);
(10 May '17, 06:33) Baron

Sorry, but couldnt find any way to attach a file. I have copied the extraction from several output files belonging to this remote DB (resend=0, 59, 61, 210, 289). As you can see the receiving process on this Remote is growing, but the cons. rejects all messages coming from this remote.

(10 May '17, 06:37) Baron

German text would not be a problem for me:)

(10 May '17, 07:08) Volker Barth

I have already translated them to english. Hope that the extracted part is enough for you to figure out the situation.

(10 May '17, 07:13) Baron

No need to attach files here, the output as text is easier to read, particularly when formatted "as is", i.e. within a pre-tag-pair, like I've added now...

That being said, I do not really understand the DBREMOTE output snippets... - it would be helpful if they are arranged chronologically from oldest to newest, which seems not to be the case (and some messages seem to miss, is that left out intentionally?).

(10 May '17, 07:14) Volker Barth

I tried to arrange them, so the upper block is the newest one with resend=289, and the lowest one is the oldest with resend=0. Yes intentionally I jumped between files (in total more than 290 files).

The interesting thing I think is with the resend=59, as you see at this stage the remote is sending another range of messages starting from 0017933664. Please let me know what can I do else in order to help you have better understanding of the situation.

(10 May '17, 07:20) Baron

Note, nobody here has claimed to be able to "understand" (and explain) the situation - that would usually require access to the system and the remote logs and the like...

However, what does the cons tell (DBREMOTE -v) when those messages with offset 0017407713 are received there? Apparently the remote has sent them several times, so the cons should have received them - and should then tell why it possibly does discard them...

(10 May '17, 07:41) Volker Barth

Sorry for using the inconvinient expression, I thank you for your efforts anyway. Below is a copy of a complaint from Cons. for the resend=289.

35:E. 2017-05-08 11:59:57. Fehlende Nachricht von "DBM114_user" (289-0017407713-0020858324-0)

(10 May '17, 07:51) Baron

Well, what are the previous and following DBREMOTE output rows indicating, i.e. the messages from that remote that the cons has received (and does not apply)?

I would expect the cons to tell that it has received "newer messages" (with a higher offset or a different resent count) by that remote, so the cons has to reject those...

(10 May '17, 07:58) Volker Barth

the complete file is unfortunately not available (always overwritten) so that I have only snippets of it. What am I thinking about is the following: -The remote is sending 7 messages (0,017,407,713 to 0,020,869,149). - The cons. tells that it is expecting messages from remote (0,017,407,713 to 0,020,858,324) i.e. the first 6 messages from remote. -Could it happen that due to any reason the cons. processes the 7th message before processing the first 6 message) so that it stops and asks for the first 6 messages?

(10 May '17, 08:32) Baron

Could it happen that ...

Not under normal circumstances, AFAIK, messages from the same remote are read in "incoming order" and then internally sorted and applied in correct order. IIRC, the log might show that "Received messages" were in "wrong" order but were applied in correct order. So that mentioned situation should not lead to a missing message.

(10 May '17, 10:30) Volker Barth

OK, thanks for the efforts. I want to close the thema, as I have already manually solved the problem, but just wanted to have more understanding so that in future I can solve problems faster.

(10 May '17, 10:34) Baron
1

the complete file is unfortunately not available (always overwritten)

That might be something to handle differently in order to increase understanding in case of upcoming problems:)

Feel free to add an answer here (you solved the situation by re-extract, right?) and accept that to show the case is closed

(10 May '17, 11:01) Volker Barth
More comments hidden
showing 5 of 23 show all flat view

To answer your original question: There is a builtin tool to adapt the contents of SYSREMOTEUSER (and there are a few questions here dealing with that) but AFAIK, it's generally "left as an exercise to the reader to look for that", in order to make sure one only does use that when the particular situation and the consequences are fully understood. - Yes, I'm intentionally vague here.

permanent link

answered 10 May '17, 03:33

Volker%20Barth's gravatar image

Volker Barth
40.1k361549819
accept rate: 34%

converted 10 May '17, 11:59

Finally I could figure out the reason for the error. It was our own mistake: the message length (option -l) was not the same on consolidate and remote :-(

As a result, not identical message length on both sides can work sometimes (as long as the generated message is shorter than the value specified in -l).

Thanks and sorry for the inconvinience

permanent link

answered 22 May '17, 05:34

Baron's gravatar image

Baron
2.1k134146174
accept rate: 46%

2

Thanks for posting the resolution... you won't be the last person needing the answer.

(22 May '17, 08:01) Breck Carter

FWIW, how did you find out about that? Did the verbose SQL Remote log reveal anything about invalid (i.e. "too long") messages or did you notice that those where silently ignored (I hope for the former...)?

(23 May '17, 15:05) Volker Barth
Replies hidden
2

In this scenario, dbremote with no verbosity should either report "Message too large" or "Deleting corrupt message". I believe the warning message will vary based on whether the compressed data is less than or greater than the maximum allowed message size on the receiving end.

Reg

I. 2017-05-23 15:14:09. SQL Remote Message Agent Version 16.0.0.2438 
I. 2017-05-23 15:14:09. Copyright stuff removed
I. 2017-05-23 15:14:09. 
W. 2017-05-23 15:14:10. Message too large
W. 2017-05-23 15:14:10. Message too large
W. 2017-05-23 15:14:10. Deleting corrupt message
W. 2017-05-23 15:14:10. Deleting corrupt message
W. 2017-05-23 15:14:10. Deleting corrupt message
W. 2017-05-23 15:14:10. Deleting corrupt message
W. 2017-05-23 15:14:10. Deleting corrupt message
W. 2017-05-23 15:14:10. Deleting corrupt message
W. 2017-05-23 15:14:10. Deleting corrupt message
W. 2017-05-23 15:14:10. Message too large
W. 2017-05-23 15:14:10. Deleting corrupt message
W. 2017-05-23 15:14:10. Deleting corrupt message
W. 2017-05-23 15:14:10. Message too large
W. 2017-05-23 15:14:10. Message too large
I. 2017-05-23 15:14:10. Scanning logs starting at offset 0000693858
I. 2017-05-23 15:14:10. Transaction log "C:/170523_dbremote_toolong/cons/cons.log" starts at offset 0000672740
I. 2017-05-23 15:14:10. Processing transaction logs from directory "C:/170523_dbremote_toolong/cons/"
I. 2017-05-23 15:14:10. Processing transactions from active transaction log
I. 2017-05-23 15:14:11. Execution completed
(23 May '17, 15:22) Reg Domaratzki

OK, that should enable us to recognize the cause of that problem (at least when one is aware of the -l option:)...). Thanks for the clarification!

(23 May '17, 15:35) Volker Barth

Actually in my case I didn't receive in the verbose log anything indicating that the message is too long! Should I understand that the errors/warnings in the output file of dbremote should be: Either - Deleting corrupt message: when the subscriber is receiving messages that are shorter than expected. OR - Message too loarge: when the subscriber is receiving messages that are longer than expected.

How can I see both messages in the above example? is it tooken from the same side?

(24 May '17, 12:01) Baron
Replies hidden

In my understanding, it depends on the size of the compressed data length of each message file which warning will appear. I guess Reg has used both kinds in his test (a message too long in compressed form and another one only too long after being uncompressed), so both warnings appear. As stated, just my guessing.

(24 May '17, 12:29) Volker Barth

This is a run receiving messages that are too large. To be honest, I'm not 100% sure why the message is sometimes "Message too large" and sometimes "Deleting corrupt message". I'd need to investigate further, but IMHO, it's not critical to know exactly why. A warning is printed if dbremote picks up a message that it doesn't recognize as a message.

If you run dbremote with the -o switch, you won't see warnings in the dbremote window, you'll only see them if you look at the generated output log. I suspect that may be why you don't see the messages.

Reg

(24 May '17, 15:25) Reg Domaratzki
showing 3 of 7 show all flat view

Actually I have dropped the replication on remote side and created new subscription for this remote on the cons. We have ready written procedures for that. I will still try to simulate similar situations and sure will post the conclusion as soon as I have any.

permanent link

answered 10 May '17, 11:09

Baron's gravatar image

Baron
2.1k134146174
accept rate: 46%

edited 10 May '17, 18:04

1

Hint: I turned your comment into an answer. Now you can just click the check mark left of the text to "accept that".

(10 May '17, 11:19) Volker Barth
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×103
×60

question asked: 08 May '17, 06:51

question was seen: 2,438 times

last updated: 24 May '17, 15:25