Version 11.0.1 2825 on Windows Server 2003. I have 20 databases running on a WAN, syncing every 10 minutes. I originally set up dbmlsync as a service scheduled to sync every 10 minutes, but certain sites would freeze and I would have to restart the service to fix it. A couple of months ago, I changed to a database scheduler event that shells out and runs dbmlsync with the appropriate options, including the -qc option to shut dbmlsync down when the sync completes. This is working better, but at a couple of locations the dbmlsync process still freezes and I have to go into Task Manager and kill the process. I have also tried changing the Windows TCP/IP keep-alive setting (TcpipParameters key) to 5 minutes, but that didn't seem to help much. I should mention that the problem locations don't have very reliable network service. Does anyone have an idea what the problem could be and how to fix it, or whether I should be doing this differently? Thanks.
I would like to follow up on this to let everyone know that timeout=600 seemed to fix the issue. Thanks.
Can you post a dbmlsync log that shows this? Are you setting the timeout protocol option in dbmlsync? If so, what is it set to?
I hadn't seen the timeout option before. One question: if I set it to, say, 600 seconds, does that mean the connection will fail if the sync takes longer than 600 seconds, even though there is no connection problem? Thanks.
If you set the timeout to 600s, then if the remote goes 300s without reading anything from the server it will send an "I'm still here" message to the server, which will respond with "Me too". If the remote hasn't heard back from the server for more than another 300s after sending the "I'm still here" message, it concludes the server is down, aborts the sync and prints some kind of timeout message to the log. Similarly, if the server ever goes 600s without hearing from the remote, it will abort the sync. The sync can take as long as it needs to; if it's particularly long, there will just be more of these "I'm still here" messages sent back and forth.
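To make the timing concrete, here is a small Python sketch of the rule as described above. It is only an illustration of the description in this thread, not MobiLink's actual implementation; the function names and return strings are made up for the example.

```python
def remote_action(timeout_s, silent_s, probe_sent):
    """What the remote does after silent_s seconds without data from the server.

    Hypothetical model: before a probe is sent, silent_s counts seconds since
    the last read; after a probe is sent, it counts seconds since the probe.
    """
    half = timeout_s / 2
    if not probe_sent:
        # Half the timeout with nothing read: send "I'm still here".
        return "send_probe" if silent_s >= half else "wait"
    # More than another half timeout with no "Me too": give up.
    return "abort" if silent_s > half else "wait"


def server_action(timeout_s, silent_s):
    """What the server does after silent_s seconds without hearing from the remote."""
    return "abort" if silent_s >= timeout_s else "wait"


if __name__ == "__main__":
    # With timeout=600 the remote probes after 300s of silence.
    print(remote_action(600, 300, probe_sent=False))
    # A sync that keeps exchanging data or probes is never cut off.
    print(server_action(600, 120))
```

Note that nothing in this model depends on the total length of the sync, only on how long each side goes without hearing from the other.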
Thanks Bill, I will try that as well as have my customer get me some logs.
Here is the log where it happened.
I. 2013-01-02 14:41:02. Begin synchronizing 'wam_pub' for MobiLink user 'neor'
I. 2013-01-02 14:41:02. Log scan starting at offset 01836311702
I. 2013-01-02 14:41:02. Processing transaction logs from directory "d:weighstation11data"
I. 2013-01-02 14:41:02. Processing transactions from active transaction log
I. 2013-01-02 14:41:02. Hovering at end of active log
I. 2013-01-02 14:41:02. Log scan ended at offset 01836319951
I. 2013-01-02 14:41:02. Connecting to MobiLink server at 'host=BOSASA1;port=2450' using 'TCPIP'
I. 2013-01-02 14:41:03. Begin upload
I. 2013-01-02 14:41:03. Uploading table operations
I. 2013-01-02 14:41:03. Waiting for MobiLink to apply upload
I. 2013-01-02 14:41:09. The user authentication value is 1000.
I. 2013-01-02 14:41:09. COMMIT
E. 2013-01-02 14:41:10. Error code from MobiLink server: -10244
E. 2013-01-02 14:41:10. Server error: The MobiLink server has encountered an error and the synchronization has been aborted
E. 2013-01-02 14:41:10. Error code from MobiLink server: -10244
E. 2013-01-02 14:41:10. Server error: The MobiLink server has encountered an error and the synchronization has been aborted
I. 2013-01-02 14:41:10. End synchronizing 'wam_pub' for MobiLink user 'neor'
I. 2013-01-02 14:41:10. Disconnecting from MobiLink server
I. 2013-01-02 14:41:11. Synchronization completed
I tried the timeout setting and I got this error.
I. 2013-01-14 11:19:13. Connecting to MobiLink server at 'timeout=300' using 'TCPIP'
E. 2013-01-14 11:19:14. Unable to connect a socket. Network Error: No connection could be made because the target machine actively refused it. (winsock error code: 10061).
E. 2013-01-14 11:19:14. Unable to connect to MobiLink server.
There is an error at the MobiLink server side of things, so have a look at that log...
E. 2013-01-02 14:41:10. Error code from MobiLink server: -10244
FWIW you should ALWAYS start by looking at the MobiLink server log, and only (rarely) look at the dbmlsync log if the problem isn't solved.
Add the timeout setting to your existing adr value: host=BOSASA1;port=2450;timeout=600
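The "connection refused" error above came from an address that contained only timeout=300, with the host and port dropped. If the address string is built by a script, a helper along these lines keeps the existing settings and appends (or replaces) the timeout. This is just a sketch: the function name is hypothetical, and it assumes the address is a plain semicolon-separated list of name=value pairs as shown in the log.

```python
def with_timeout(adr, timeout_s):
    """Return adr with timeout=<timeout_s> appended, replacing any existing timeout."""
    # Split into name=value pairs, ignoring empty pieces such as a trailing ';'.
    parts = [p.strip() for p in adr.split(";") if p.strip()]
    # Drop any timeout already present so the option is not duplicated.
    kept = [p for p in parts if p.split("=", 1)[0].strip().lower() != "timeout"]
    kept.append(f"timeout={timeout_s}")
    return ";".join(kept)


if __name__ == "__main__":
    print(with_timeout("host=BOSASA1;port=2450", 600))
```

With the address from the log, this yields the same string suggested above, with host and port preserved.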