Hi, SQL Anywhere 16.0.0.2546, Windows 2012 R2 We have an automated data export for our client (just a bunch of SELECT queries) which output data every evening at 22:00. However, sometimes, without any visible pattern, we've noticed that there are few trades missing from the export, although the trades are very much present in DB on a later inspection. We even created another, identical event, the only difference being that it runs at 07:00 next morning, which always works perfectly and includes what the first export does not. Last night clients sent us a database backup taken at 00:30 in the morning, saying that "trade 12345 did not get exported". This trade was not present in the backup, although it was most definitely entered by users the afternoon before. However, they sent us another backup taken 10:00 this morning, and that trade was very much there. We are not just talking about 1 row and 1 table, but about 20 different rows across 10-15 different tables. Initially I thought that the wrong backup was taken, which had no activity of the day 2017-10-19 whatsoever, but in one table I found 40-50 rows that were most definitely populated on 2017-10-19 evening. So it is very much the backup of that night. So, to summarise:
I have never seen any behaviour like this before. Tables are not temporary or compressed or encrypted. The database is the only one on site, it is not Mobilink-ed or Ultralite-d. Just to underline - there is no visible pattern, it only happened 5 or 6 or 7 times in 12 months or so, during which time 5000-6000 exports were done 100% successfully. Any clues and/or interest warmly appreciated! |
Could timezones be causing a problem? Eg, os timezone, database server timezone, script timezone etc. I really don't see how it could. It's one of the first things we looked at, but the timing of the event is completely outside of any chance for that.
(27 Oct '17, 04:14)
PcrMember
|
Have you looked at the transaction log to determine the time that the transaction for trade 12345 was committed? E.g. Perhaps the transaction is not being committed until several hours after it was inserted into the table? (a long shot guess, but worth confirming!)
At present, the Transaction Log is being truncated on automatic backup each evening - I have sent instructions that this should be disabled today, and if it happens again, I will translate the tran log and investigate this first thing.
And no, not a long shot at all - one of the first things I've thought was "I wish we didn't truncate the log", as I was so gobsmacked at what I was seeing.
But I do have to say that I'm highly sceptical though. Even the smallest transactions in our application call SQLCA.Of_Commit() or direct commits all the time - a lot of things went into that DB on 2017-10-19, most bizarrely one DB table kept the data entered around 21:50 in the evening, but none of the others did, and all work was finished around 17:00 or so.
Also, even if transactions are not committed, they would surely be present on SELECTs, right? Hundreds of times I inserted 2000 rows, did a few analytical SELECTs before any of it was committed, and then did a rollback. All the data was there until a rollback was called. And if something did call rollback to cancel at least 20 rows in about 15 tables, they would not be present 8 hours later, surely?
Thanks
Whether or not a SELECT (from another connection) sees a non-committed transaction depends on the isolation level of the connection. If isolation level 0 is used then it will see uncommitted transactions. See the documentation on isolation levels for more information.
Note: the connection that does the insert/update/delete always sees the effect of the action immediately independent of its commit state.
To be honest, I forgot all about this - thanks. However, I just checked, the isolation level is set to 0.
My second thought would be to verify that the automatic backup(s) are running correctly. E.g. Assuming you are backing up to the same location each night, could it be that periodically the backup file (or location) is not writable - e.g. network share not reachable or backup file is locked by another process? - and therefore what you think is the (new) backup is really just the previous copy of the backup?
Could it be an index is corrupted or another kind of corruption or bad RAM? Does the database validate successfully?
Very wild guesses...
Yes, it does, I'm afraid. The server has been double-checked, everything is fine.
No, I have checked the timestamps on the backup files several times - all are completely consistent to make me believe that the files got backed up ok and on schedule.
In the meantime, this mystery gets stranger by the minute. So, as I have explained above,
22:00 - data exported, trade wasn't exported
00:30 - automatic backup "A" done - trade not present in it
07:00 - secondary identical data export, on the same database, trade PRESENT
10:30 - manual backup "B" done - trade PRESENT
BUT - there is another redundant backup, done at 21:00, so one hour before the first automated data export was done - we have just received the files for it, ALL DATA IS PRESENT THERE. Trade, and all the corresponding data IS THERE. Incredible!
So to revise the schedule to the full:
21:00 - redundant automatic backup of Live "000" done - TRADE PRESENT IN BACKUP "000"
22:00 - data exported from Live, TRADE NOT PRESENT IN EXPORT
00:30 - automatic backup of Live "A" done - TRADE NOT PRESENT IN BACKUP A
07:00 - secondary identical data export, on the same Live database, TRADE PRESENT IN EXPORT
10:30 - manual backup of Live "B" done - TRADE PRESENT IN BACKUP
I'm going slightly mad.
Can you translate the logs via DBTRAN and check how the ooperations on the according row are contained (or not)?
That will certainly be the plan if it happens yet again - I'm afraid one automated backup was truncing the LOG file every night, which has now been switched off.
Is the LOG truncated and removed, or stored somewhere for historical purposes?
I can't say, I don't have access to the production server or the backups, I'm afraid :(
Show us the command that performs the export; does it specify a date/time range in the where clause?
Verify that client system time is the same as the database system time, and if they are different, check that the difference doesn't interfere with the export.
I am 99.9999% sure that SQL Anywhere doesn't "hold back" anything to be "released later", and that the problem is on the client side... the remaining 0.00001% is reserved for divine intervention, magic and the occult :)
Thanks Breck, always nice to see a reply from you :)
I'm afraid I can't really show you the command, it is a scheduled event calling a SP which performs a big but not too complex SELECT statement from 1 big view and 3 or 4 joined tables. Yes, a "today()" parameter is always passed to the stored procedure to filter the data by specific dates, but this is the first thing we eliminated. System date is definitely the same as DB date, and they are straight DATEs, not DATETIMEs, so time aspect does not even enter the equation here. The event is kicked off around 21:50, and only lasts altogether for 3-4 minutes, so it is done well before any 00:00 time which could mess up the today() with today()+/-1.
To be quite honest, I am very close to tell the clients that I have to assume that somehow, the event got called on some backup copy of DB of Today-1. On investigation of two backups, I am simply not finding another explanation. This backup that we received had Database Users' "Last login timestamps" of 2017-10-18, not 2017-10-19! It is beyond any doubt that system was used on 19th, the "other" backup shows 19th as "Last login time". What is just beyond me is how could have backup event been called on a copy of today()-1. Yes, there are some backups of DB and LOG files present in other subfolders on the machine, but none of them have a service running on them, or are accessed by any application, or ODBC.
And those backups are also not started for validation purposes (which might trigger an event to run there)?
Definitely not. One backup is sitting quietly in its own folder being overwritten on a weekly basis, and the other is there just to copy the files to a Test server.
You are not using snapshot isolation within the event, right? Because a snapshot transaction would be the only thing to ignore later modifications without blocking them, AFAIK...
But apparently that would require the event to start before the according row is inserted/modified, and I guess that is not the case here...