We have a Relay Server/MobiLink environment, and periodically the MobiLink server stops responding (we see that the database connection has dropped, or something similar). It would seem that restarting that one service should solve the problem; however, I find I have to restart the ML Server, then restart the RSOE, then restart the ML Server again to get everyone talking again.
It is hard to find a dead chicken in the server room to wave over the keyboard, so I am hoping someone can recommend a more reliable restart procedure.
asked 23 Feb '10, 22:37
What version of the RSOE and MobiLink server are you running? I would expect to see the following:
1) The MobiLink server is shut down; the RSOE determines that the ML server is down.
2) The MobiLink server is restarted; the RSOE detects that the ML server is active again.
You should not have to restart the RSOE when the MobiLink server is shut down and then restarted. There was a bug in the RSOE that caused it to incorrectly detect that the MobiLink server had gone away and then fail to reconnect. I suggest upgrading the Relay Server and RSOE to the latest EBF. These fixes may address the issues you are running into:
================(Build #2346 - Engineering Case #605874)================
When the Relay Server Outbound Enabler (RSOE) timed out an up channel connection, the RSOE would have recovered the connection, but it may have resulted in an invalid opcode being received in error from the Relay Server, and then cause the RSOE to disconnect both the up and down channels and restart. This is now fixed so that the RSOE will handle the reconnect properly without causing a more substantial restart before restoring the service.
================(Build #2170 - Engineering Case #563592)================
When run on Windows systems, both the MobiLink server and the Relay Server Outbound Enabler (RSOE) could have held onto sockets for longer than necessary. This would have caused both to use up sockets faster than necessary, possibly exhausting system socket limits. With the RSOE, needless timeouts could also have occurred. This behaviour was particularly evident with non-persistent HTTP/HTTPS connections, and appeared to be very much OS and machine dependent. This has been fixed.
As for the following:
E. 2010-02-26 07:54:20. <1> [-10030] A network read failed. Unable to read data from the remote client
E. 2010-02-26 07:54:20. <1> [-10091] This connection will be abandoned due to previous errors
If you are running the MobiLink server with -vf or -v+, it reports first-read errors. This network failure is a result of the RSOE pinging the MobiLink server to determine whether it is available. You can suppress this error by removing -v+ or -vf if you have it set.
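If you need to keep the verbose logging on for other reasons, one alternative is to filter these two errors out when reviewing the log. The following is only a rough sketch, not anything shipped with MobiLink: it assumes every log line has the same shape as the two lines quoted above, and the names `PING_NOISE_CODES`, `is_ping_noise`, and `filter_log` are made up for illustration. Note that it would also hide genuine read failures from real clients that carry the same codes, so use it only for reading the log, not for alerting.

```python
import re

# Error codes taken from the two log lines quoted in this thread:
# -10030 (network read failed) and -10091 (connection abandoned).
PING_NOISE_CODES = {-10030, -10091}

# Assumed line shape, based only on the excerpt above:
# E. 2010-02-26 07:54:20. <1> [-10030] A network read failed. ...
LINE_RE = re.compile(
    r"^(?P<level>[A-Z])\.\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\s+"
    r"<(?P<tag>[^>]+)>\s+"
    r"\[(?P<code>-?\d+)\]\s+"
    r"(?P<message>.*)$"
)


def is_ping_noise(line):
    """Return True for error lines carrying one of the ping-related codes."""
    match = LINE_RE.match(line.strip())
    if match is None:
        # Lines without an error code (or in another shape) are never dropped.
        return False
    return match.group("level") == "E" and int(match.group("code")) in PING_NOISE_CODES


def filter_log(lines):
    """Return the log lines with the ping-related errors removed."""
    return [line for line in lines if not is_ping_noise(line)]
```

For example, `filter_log(open("mlsrv.log").read().splitlines())` would return the remaining lines, where `mlsrv.log` is a placeholder for whatever file your MobiLink server output is written to.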
answered 04 Mar '10, 17:29