SQL-Remote msg agent not working during HA failover

Hi Everyone,

We have setup a SQL Anywhere HA along with SQL Remote replication on separate hosts. The HA primary and mirror acts as the consolidate database and we have a separate SQL Anywhere database acting as the SQL Remote remote database. Please find the details of the environment below:

Name of the SQL Anywhere instances: HA Primary: MKMSD1(Current Primary) HA Mirror: MKMSD2(Mirror server) HA Arbiter: MKARB(Arbiter server) SQL Remote remote database: MKREPDEV

All the above SQL Anywhere instances are running on version 17.0.0.4095.

The replication from the HA primary(MKMSD1/MKMSD2) to the SQL remote remote database - MKREPDEV is working fine. The message agents are running on the hosts running the HA primary database and the SQL remote remote database.

We have setup events on the SQL Anywhere primary database to stop and start the SQL Remote message agents.

The first event gets fired during a HA failover on the primary node using the system event - MirrorFailover. This event starts a new message agent on the new HA primary.
The second event runs every 1 second to check the status(primary/mirror) of the SQL Anywhere server in the HA setup and the ReadOnly database property of the database. If the node is a mirror node and the ReadOnly property is On, then the message agent that was running on the host gets killed.

The HA failover was tested to see if the message agent gets shutdown on the mirror(previous primary) node and a new message agent starts on the new primary node(previous mirror). The message agents are getting fired as expected on the primary node of the HA during a failover.

Problem/Issue:

The replication works fine from the initial HA primary node from where the SQL Remote replication was setup. Post failover from the initial HA primary node to the new HA primary node, the message agent starts but gets stuck with the message - "Reading active portion of the transaction log". If we fail back to the initial HA primary node, the SQL Remote message agent again starts reading new transactions and everything works fine. So, the SQL Remote replication works with only the first HA primary node(where SQL Remote replication was setup) and not with the new HA primary after a failover.

We are unsure how to make the message agent work on the new primary node post failover.

Please help us with your valuable suggestions and advices to resolve this issue.

Many thanks.

dbremote sql-remote high-availability

asked 16 Nov '21, 11:23

nandakumark1989
56●2●2●6
accept rate: 0%

edited 16 Nov '21, 13:33

I have never used SQL Remote in a HA setup, so just a thought:

You basically seem to control two SQL Remote instances running on different machines connecting to two different databases, and only one is allowed to run at a time. You do control that from the database servers themselves via events.

As SQL Remote is "just a client app", would it work better if you treat it like a client app, i.e. have one instance running (possibly on the arbiter or a separate machine) connected to the "primary database" (just like a normal app would do, i.e. providing the alternate server name and specifying both HOST names in the connection string), and let it loose its connection once a HA failover appears, and then (in a loop) make it re-connect to the new primary database?

As stated, I have not tried it myself, and apparently the one machine running the message agent could introduce a single point of failure, and you might need to adapt DBREMOTE's command line by pointing to a different log directory.

(17 Nov '21, 01:51) Volker Barth

Replies hidden

FWIW, I found this KB doc on SQL Remote and HA, to cite:

Q: Can we implement replication using SQL Remote and SQL Anywhere as consolidated or remote database in High Availability ?

A: It is technically possible. The main challenge is getting access to and maintaining offline transaction logs. dbremote (but it is true for dbmlsync as well) will assume that offline transaction logs can be found in the current running directory, so you have to plan carefully your backup/recovery strategy or your offline transaction log management strategy.

(17 Nov '21, 01:55) Volker Barth

Replies hidden

Hi Volker, Thanks for the reply.

You are right. I am using the virtual server name for the primary(alternate name) in the connection strings for the SQL Remote message agent on the consolidate side. So, it always connects to the current primary database.

The message agent starts automatically via the system event(MirrorFailover) as stated above but the only problem is that it continues reading active portion of the transaction log on the new primary. I tried setting the debug option for the remote user to yes but it did not reveal anything in the message agent log.

I made some important observations today.

The environment where we were setting up the HA and SQL-Remote replication did not have any backup plan. So, the log file grew to nearly 38GB in size. My guess is that the message agent is reading all the transactions from the very beginning but I am not very sure about it because when the message agent starts on both HA nodes it says it is starting reading the transaction log from the same transaction offset.

If we setup a backup script to backup the transaction logs on the HA primary/mirror, I will set the option delete_old_logs to true to prevent the message agent from having to read a huge log file and maintain minimal number of active and offline transaction log files. I am unsure at this point how the message agent will behave during the log backup and truncation. I have to test all these scenarios.

(17 Nov '21, 03:24) nandakumark1989

I went through the KBA and it gave me an assurance that it is technically possible. I could not find this kind of a setup documented in SQL Anywhere forum/SAP support. So, I hope it will be a first if we get it working successfully. Thanks for all the inputs. It is very encouraging.

(17 Nov '21, 03:37) nandakumark1989

The environment where we were setting up the HA and SQL-Remote replication did not have any backup plan.

Oops, using SQL Remote without a backup plan... - I'd highly recommend to address that before thinking about HA....

AFAIK, the number of offline logs is not critical for the SQL Remote performance, the message agent will start to scan logs at the last offset it has sent (which will usually be in the active log), and will only come back to offline logs when a remote requires a re-send of older messages. Nevertheless, using DELETE_OLD_LOGS is reasonable, possibly with a number of days or delay to make sure it fits the backup plan.

(17 Nov '21, 03:53) Volker Barth

flat view

Be the first one to answer this question!

toggle preview

community wiki:

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

*italic* or _italic_
**bold** or __bold__
link:[text](http://url.com/ "title")
image?![alt text](/path/img.jpg "title")
numbered list: 1. Foo 2. Bar
to add a line break simply add two spaces to where you would like the new line to be.
basic HTML tags are also supported

learn more about Markdown

Question tags:

sql-remote ×103
dbremote ×78
high-availability ×61

question asked: 16 Nov '21, 11:23

question was seen: 578 times

last updated: 17 Nov '21, 03:53

SAP SQL Anywhere

SQL-Remote msg agent not working during HA failover

Follow this question

Related questions