Hey everyone,

We're currently struggling with some issues in our production product(s), and a bit of help would be much appreciated.

The symptom

It seems quite simple at first: the synchronization starts successfully, the upload occurs, the download is prepared, it downloads and puts rows into the transaction log, and finally we get a
on one table (I must say, this table can get pretty big after a few days). The error occurs on several devices that had synchronized successfully until then.

The context

We're using:
The product handles a large amount of data and, in our case, the device has reached a point where it needs to synchronize 100K+ rows on the table on which the error is raised. We have a very strong Wi-Fi connection, so the problem is not coming from a client losing its connection. According to the logs, the time between the moment we start syncing a table and the moment the error occurs is quite random: I've seen values of ~60 minutes, ~35 minutes and... 4 minutes.

And the question
I thank you in advance for your support,
Best regards.

Edit (2016-08-26): here's an extract of the logs
Near these logs, we can find :
We have an ml-session-id header on every HTTP request, which the server uses to associate the different HTTP requests of a sync with each other. A 404 probably means that the server received an HTTP request with an unknown session id. There are two likely causes of this:
If you're going to be doing large downloads, you might also want to look into the restartable downloads feature. With that, the server will keep failed downloads around so they can be resumed even after a disconnect. The download information isn't shared across servers in a farm, so if you turn on resumable downloads with more than one server, you'll have to make sure that all requests from a remote go to the same server.

Hey Bill, I've looked into the server logs and have spotted several "[-10117] Stream Error: Unable to continue unknown HTTP session". Is there any difference between the -10117 and the -12079?
(24 Aug '16, 10:27)
yan
If you are seeing this error message then the MobiLink connection to Oracle may be blocked by a lock held by some other connection, perhaps another MobiLink connection, perhaps not. (I think the word "clocking" should be "blocking" in the "Probable cause" below)

Connection ID '%1' is currently blocked by connection ID '%2' for %3 seconds on %4

Error code: 10117
Error constant: CONN_BLOCKED_ON
Parameter 1: A MobiLink server database connection ID.
Parameter 2: Another database connection ID.
Parameter 3: Time in seconds.
Parameter 4: Table name or database operation that is currently blocked.
Probable cause: The MobiLink server will detect any its database operations that take longer than a given time and reports the connection ID that is currently clocking the MobiLink server connections. To avoid this problem, try to reduce the transaction open time.
(24 Aug '16, 11:33)
Breck Carter
FWIW the MobiLink server log is the very first place to look for diagnostic information about MobiLink errors, because that's where you will find what you're looking for 99.99% of the time. The other 0.01% of the time, helpful information will be found elsewhere; e.g., on the client side, Oracle database server, etcetera. Of course, I am exaggerating. The real number is only 99.9% :)
(24 Aug '16, 11:37)
Breck Carter
I don't see the error code 10117 message in the MobiLink server logs, only the -10117. That makes me wonder, because there is a high probability that the central database takes a long time to execute the query. Is there any link between the two error codes? :o
(26 Aug '16, 03:36)
yan
Well, IMHO, both are totally different messages:
As you are seeing a -10117 error but do not get the 10117 warning, I guess a blocked connection is not necessarily an issue here... FWIW, I don't find anything about the -12079 error mentioned by Bill in the docs. Do you notice a -10279 error ("Connection was dropped due to lack of network activity") instead?
(26 Aug '16, 04:17)
Volker Barth
Yes, totally! And it is reported on the same synchronization id as the one that reported the initial problem. Should I go with Bill's answer then?
(26 Aug '16, 04:38)
yan
Sorry, I just tried to clear up the error codes... - can't tell on the ML problem itself. However, Bill – who is the expert – has raised the question whether you are using a ML server farm. If not, then in my limited understanding, point 2 from his answer would not apply here. So do you use several ML servers? For further diagnosis, it might be helpful if you could include a ML server log snippet showing these errors and the lines before/after...
(26 Aug '16, 04:47)
Volker Barth
We have two ML servers running. The reverse proxy is set to keep HTTP sessions sticky, and we don't use the restartable feature. We did try it, but decided to get rid of it. Finally, I've added some ML server logs to the initial post. :-)
(26 Aug '16, 05:11)
yan
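To illustrate what "sticky" has to guarantee with two ML servers, here is a minimal sketch of session-affine routing. It is not how any particular reverse proxy is configured; the function `pick_backend` and the backend names are hypothetical, and hashing the session id is just one possible way to get the property that every request of one sync reaches the same server.

```python
import hashlib


def pick_backend(session_id, backends):
    """Map a session id to one backend deterministically.

    Hypothetical helper for illustration: the same session id always
    yields the same backend as long as the backend list is unchanged.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical backend names standing in for the two ML servers.
BACKENDS = ["ml-server-1", "ml-server-2"]

first = pick_backend("session-abc", BACKENDS)
# Every later request carrying the same session id goes to the same server.
later = [pick_backend("session-abc", BACKENDS) for _ in range(5)]
```

If the routing ever sends one request of a sync to the other server, that server has no record of the session, which would match the "unknown HTTP session" message in the logs.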
> Is there any link between the two error codes ? :o

Other than my stupidity, apparently not :)
(27 Aug '16, 00:12)
Breck Carter
If it's an HTTP 404, that typically means the server was found but not the page. In the case of MobiLink, I think it simply means the URL for the MobiLink server was not found.
I've added some detail to the question. This doesn't seem to be the case, as the sync starts successfully, does the upload, and downloads a few tables before crashing.
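For what it's worth, a 404 in the middle of a sync that started fine is consistent with the session-id explanation given in the other answer. Below is a toy model, not MobiLink's actual implementation: the class `ToySyncServer` and its methods are made up for illustration, and the two ways a session becomes unknown here (a request landing on another server, and the server discarding the session) are assumptions.

```python
class ToySyncServer:
    """Toy model: per-sync state lives in memory, keyed by session id."""

    def __init__(self):
        self.sessions = {}  # session id -> number of follow-up requests seen

    def begin_sync(self, session_id):
        # The first request of a sync registers the session on this server only.
        self.sessions[session_id] = 0
        return 200

    def continue_sync(self, session_id):
        # A follow-up request for a session this server does not know
        # is answered with 404, even though the URL itself is valid.
        if session_id not in self.sessions:
            return 404
        self.sessions[session_id] += 1
        return 200

    def discard(self, session_id):
        # Stand-in for the server giving up on a session (e.g. after inactivity).
        self.sessions.pop(session_id, None)


server_a = ToySyncServer()
server_b = ToySyncServer()

started = server_a.begin_sync("s1")          # sync starts fine on server A
same_server = server_a.continue_sync("s1")   # follow-up on A is accepted
other_server = server_b.continue_sync("s1")  # follow-up on B: unknown session
server_a.discard("s1")
after_discard = server_a.continue_sync("s1") # A no longer knows the session
```

In this model the URL is found every time; only the session lookup fails, which is why the sync can get through the upload and several tables before the error shows up.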