I am responsible for a SQL Anywhere MobiLink system that I did not write, so excuse my ignorance.

This past weekend, a hard drive crash took down our main database, which has a set of tables synced to dozens of user phones -- an iPhone app is used to gather customer equipment data on-site. We use MobiLink to do the sync. We have a database backup at the end of every weekday, so we have not lost any data from the main server, but how can I get the phones to resume syncing? Some of them have synced up, but I am getting many errors (794 among others). Is there a way to have all the distributed users sync back to the master? I am willing to lose any changes on the phones to get things working again.

This surely must be a scenario your product was designed to handle. Right?

This question is marked "community wiki".

asked 11 Dec '17, 16:52

keithhooker

1

Welcome :-)

It might be helpful to include the errors from the Mobilink log file.

(11 Dec '17, 17:16) Tim McClements
1

The -794 error reports that an error has occurred in the MobiLink server. As Tim has indicated, the MobiLink server log is important in this case. I am going to guess that the MobiLink server log will indicate that there is a progress mismatch. If that is the case, I am going to suggest that the database was not fully recovered.

Are you absolutely sure that the consolidated was recovered to the time that the hard drive crashed? What is the consolidated database? If not SQL Anywhere, how does that database ensure that recovery is to the last committed transaction in the case of a hard drive crash? If SQL Anywhere, was the transaction log damaged by the hard drive crash? If not, was it applied to the backup?

MobiLink synchronization is designed to handle cases where the database is fully recovered. Remotes will not sync if data relevant to the status of their synchronization has been lost.

(12 Dec '17, 08:20) Chris Keating
4

excuse my ignorance

Thanks for the warning :)...

a worthless product

If you are actually looking for help, please provide the information requested (the dbmlsrv -o diagnostic log)

(13 Dec '17, 07:24) Breck Carter
2

FYI: One of the site admins forwarded snippets of the MobiLink log containing the errors. They include [-10400] Invalid sync sequence ID for remote ID, which is to be expected if data related to the sync progress was lost. There were also some [10106] Unable to lock the remote ID errors, which suggest overlapping syncs from the same client.

(13 Dec '17, 11:28) Chris Keating

Summarizing information from the comments...

A system timeline, with consolidated database states Sn, restore XX, and lowercase remote changes uploaded:

condb   S0..S1..S2..S3...XX...S2..S2..S4
-           ^       ^             ^
remote1 ....a.......c.............d
-               ^                     ^
remote2 ........b.....................e

Remote1 will fail to synchronize its changes d, because it realizes that the consolidated database has lost its changes c. What can it do at this point? It stops to protect data integrity and external intervention is required. Options include:

  1. reset remote1's state on the server so it will synchronize despite the lost data. This will not work well if d depends on c or other post-S2 changes!
  2. obtain remote1's database and manually recover new data, or perform a sync that uploads all data. Whether upload-all is safe or not depends on how the sync scripts and condb are designed; it could overwrite new data with old.
  3. recreate the remote database and sync to obtain a fresh consistent copy of the data. If done without #2, changes c and d are lost.

Option #3 is the only sure way to achieve consistency. It's not automatic because it loses data.

Back to the timeline, remote2 will synchronize successfully, because from its point of view everything is consistent. It did not sync within the window of loss.
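
The timeline can be replayed with a small toy model. This is only a sketch of the idea, not how MobiLink tracks progress internally: the `Consolidated` and `Remote` classes, the `sync` and `upload` methods, and the use of the last accepted change as the "progress" value are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Consolidated:
    """Toy consolidated database: remembers the last upload accepted per remote."""
    progress: dict = field(default_factory=dict)  # remote id -> last accepted change

    def snapshot(self):
        return dict(self.progress)

    def restore(self, backup):
        self.progress = dict(backup)

    def sync(self, remote_id, last_acked, new_change):
        """Accept new_change only if both sides agree on the previous upload."""
        if self.progress.get(remote_id) != last_acked:
            return False  # progress mismatch: the server no longer knows that upload
        self.progress[remote_id] = new_change
        return True


@dataclass
class Remote:
    """Toy remote: remembers the last change the consolidated acknowledged."""
    remote_id: str
    last_acked: Optional[str] = None

    def upload(self, condb, change):
        ok = condb.sync(self.remote_id, self.last_acked, change)
        if ok:
            self.last_acked = change
        return ok


condb = Consolidated()
remote1, remote2 = Remote("remote1"), Remote("remote2")

results = {}
results["a"] = remote1.upload(condb, "a")  # S0 -> S1
results["b"] = remote2.upload(condb, "b")  # S1 -> S2
backup = condb.snapshot()                  # backup taken at S2
results["c"] = remote1.upload(condb, "c")  # S2 -> S3
condb.restore(backup)                      # XX: crash, restore to S2 (c is lost)
results["d"] = remote1.upload(condb, "d")  # rejected: remote1 remembers c, condb does not
results["e"] = remote2.upload(condb, "e")  # accepted: remote2 last synced before the backup

print(results)
```

In this model only `d` is rejected, which matches the timeline: remote1 synced inside the window of loss and remote2 did not.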

answered 14 Dec '17, 12:46

Tim McClements

So the Mobilink system has no way to recover and reset the sync? That would be an insane design and a worthless product.

Our database system is being hit all the time from different apps and sources. Returning to a state that it was in several days ago at the exact moment of a disk crash is not possible.

answered 12 Dec '17, 09:04

keithhooker

Without knowing the specifics of what the MobiLink server is reporting, I am reluctant to advise next steps. Certainly, you can reset the remote if that is the appropriate action. See the ml_reset_sync_state system procedure. Please read and understand what this does before using it.

The point I was making with respect to recovery is that if there have been synchronizations since the point in time of the database that has been put into the system, those remotes will no longer be able to sync without some intervention. This is not ideal. I take it that the database that was put back into production was not restored to the point in time of the hard drive failure. If so, there will likely be remotes that cannot sync if they have sync'd at a time after that restore point, i.e., you have lost the sync status that synchronization uses to ensure data consistency between the remote and consolidated. You can reset the sync state to force that remote to synchronize.

(12 Dec '17, 09:24) Chris Keating
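
The restore-point reasoning in the comment above suggests a simple triage: any remote whose last successful sync is later than the restore point is a candidate for intervention. A minimal sketch, assuming the last-sync times can be gathered from somewhere (for example surviving server logs or the devices themselves); the function name, remote IDs, and timestamps below are all hypothetical.

```python
from datetime import datetime


def remotes_needing_intervention(restore_point, last_sync_by_remote):
    """Return remote IDs whose last successful sync is later than the restore point."""
    return sorted(
        remote_id
        for remote_id, last_sync in last_sync_by_remote.items()
        if last_sync > restore_point
    )


# Hypothetical data: the restore point and sync times are invented for illustration.
restore_point = datetime(2017, 12, 8, 23, 0)
last_sync_by_remote = {
    "phone-01": datetime(2017, 12, 8, 16, 30),  # before the restore point
    "phone-02": datetime(2017, 12, 9, 10, 15),  # after it: sync state was lost
    "phone-03": datetime(2017, 12, 10, 9, 5),   # after it: sync state was lost
}

affected = remotes_needing_intervention(restore_point, last_sync_by_remote)
print(affected)
```

Remotes outside the returned list should, by the same reasoning, sync without any reset.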

Your log suggests that the reset procedure in theory should work with the affected users.

(12 Dec '17, 10:07) Chris Keating

Please hold off on the use of ml_reset_sync_state. We are looking at options on the remote.

(12 Dec '17, 10:36) Chris Keating
3

Please note that this is really a salvage because data has been lost at the consolidated - the data might simply be related to the current sync state of affected remotes or could extend to data that should be in the consolidated and may no longer exist on the remote. Given the flexibility of the scripts that are used for synchronization, MobiLink would not be able to determine what rows should look like - that is left to the consolidated and its recovery gear. This is not a defect in design. It is reasonable that MobiLink expects that the consolidated database is capable of recovery with no loss of data. That did not happen in this case.

You now need to decide: 1) Is there data on the remote that may be important to the application, such that efforts should be made to get that data into the consolidated? If so, this may be a manual effort for the affected remotes. -or- 2) Are you willing to lose that data? If so, options include resetting the remote status or recreating those remotes by sync'ing the database from an empty state. In that case, keep the existing remote, as you may be able to manually re-enter the information that exists only in the remote.

You may want to work with technical support to go through the details if you are not familiar with MobiLink.

(12 Dec '17, 13:36) Chris Keating
1

The design is not insane and the product is not worthless.
Operating it without an administrator who has the required skills and understanding *is* insane. This is probably not your fault, but it is a fault nonetheless. If you get out of this with help from somebody, you're still left with a system that, in the best possible situation, works, and you don't know why.
It is mandatory and critical that a node in a system of communicating databases is restored to a point where no communication with another database is missing. This is particularly critical for any consolidated database, which communicates with all other nodes. All supported MobiLink consolidated database platforms have mechanisms to achieve this, depending on the failure scenarios you need to cover and the infrastructure required to implement the necessary countermeasures. Determining this requires solid knowledge (not at rocket science level) of the way MobiLink works and of the administration of the consolidated database.
I wish you good luck that somebody here or from TechSupport can help you out of this. If anybody can, Chris is a hot candidate. Whatever will be the outcome of this crisis, I highly recommend that you go to your manager and insist that you and probably somebody else get trained on the subject. If [s]he compares the cost for such a training with the risk of not having somebody available with the skills to handle a crisis, [s]he will be responsible for the decision.

Just my €.02...

Volker

(15 Dec '17, 21:01) Volker DB-TecKy

It is mandatory and critical that a node in a system of communicating databases is restored to a point where no communication with another database is missing.

Just to add: Understanding such a system also includes the knowledge that in the (hopefully rare) case where the mentioned requirement cannot be fulfilled (i.e. the consolidated database cannot be restored up to the point of the last sync), any of the remote databases that have sync'ed with the consolidated after that point may have lost data and/or may need to be resynchronized. That's what Chris and Tim have explained.

(17 Dec '17, 12:24) Volker Barth

question asked: 11 Dec '17, 16:52

question was seen: 334 times

last updated: 18 Dec '17, 17:34