We have a setup with two DB instances, connected via SQL Remote and FTP. One table held about 6 million rows, and we then deleted ~5 million of them on the remote site in blocks of ~500,000 rows (this took ~20 minutes). Now we have ~2,700 messages in the queue flowing back from the remote to the consolidated database. Our problem is that it now takes about 50 minutes per FTP message to apply these deletes from the remote side, so applying all messages will take around 96 days... And of course all other "real" updates/inserts etc. are just queued up in more messages... We don't see a lot of I/O traffic from the database, but the database uses 100% CPU time of one (out of 8 available) cores. What could we do to speed up applying these messages? We have already added the useful indexes, but that did not improve performance. Is there any chance to make this process faster?
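Just to illustrate the pattern (the exact statements don't matter here, and the filter condition below is only a placeholder), the blocked deletes on the remote were along these lines:

-- Sketch only: remove rows in blocks of ~500,000 on the remote site.
-- Table name as in the statement quoted further below; the WHERE
-- condition is purely illustrative.
DELETE TOP 500000 FROM user.ModLog
WHERE ActionDate < '2010-01-01';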
Actually, we just can't afford to wait 96 days for the queue to clear up... Or any other ideas how to speed up dbremote (using more than one CPU/core, for example)? André
Lesson learnt from this question: the strong recommendation that all replicated tables should have a primary key is not only about avoiding inconsistent data (see the doc note below) but also about avoiding miserable performance... I wasn't aware of this problem - though, fortunately, I have never used SQL Remote without fitting PKs.
I was aware of that recommendation... but as time goes by we sometimes make mistakes... and have to pay for them :) @André: Agreed :) - But it's a huge benefit when others can learn from one's own mistakes, too, with the help of this site - at least that's my hope...
What's your DBREMOTE command line? Particularly the -m setting? And the -g setting? While SQL Remote will turn all SQL operations into single-row operations (leading to 5 million single DELETE statements here), the required time still seems unreasonably high.
The statement that finally gets executed in the database is:
DELETE FIRST FROM user.ModLog WHERE ModLog='100000JEUW' AND Users IS NULL AND LoginName='--system--' AND FullName='--system--' AND ActionDate='11:40:44 .74326 2010/10/13' AND TableName='Xdarinfo' AND FieldName='actperiod' AND Primarykey='1000000DPE' AND ActionType=1 AND OldValue IS NULL AND NewValue IS NULL AND Remark IS NULL AND OldString IS NULL AND NewString='1108-0910' AND OldDate IS NULL AND NewDate IS NULL AND OldNumeric IS NULL AND NewNumeric IS NULL AND Language IS NULL
The command line on the consolidated database (where it will take 96 days) is:
dbremote -c "uid=dba;pwd=XXXX;eng=sqlsvXX;dbn=XXXXXXXXX;CS=UTF-8" -l 1024000 -m 10M -ud -o /var/log/sqlremote.log /Data/SQLData/XXXX
On the remote (where I did the delete statements) it's -c "uid=dba;pwd=XXXXX;eng=XXXX;dbn=nlpdbhulst;CS=UTF-8" -l 1024000 -m 10M -x 20M -o C:\Data\logs\dbremote.log D:\Data\SQLData\database\
It's SQL Anywhere 11.0.1.2427 under Windows 2008 64-bit for the remote DB and SQL Anywhere 11.0.1.2376 under Debian Linux 64-bit for the consolidated one.
Perhaps the important part: that table has NO primary key defined. By mistake I missed that, and I tried to first remove the many rows before changing the table definition... probably that's what is causing these problems...
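For reference, adding the key afterwards would be something along these lines - assuming the ModLog column is in fact unique per row, which is just a guess based on the statement quoted above:

-- Sketch only: declare the existing ModLog column as the primary key.
-- Note: this would only help for operations logged AFTER the change;
-- the ~5 million deletes already in the transaction log keep their
-- full-row WHERE clauses.
ALTER TABLE user.ModLog ADD PRIMARY KEY (ModLog);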
@André: Yes, the missing PK will cause such awful DELETE statements. Usually SQL Remote uses PK-bound (and very efficient) single-row DELETE statements like "DELETE Table1 WHERE pk = x". In your case, I guess you can't fix the problem after the fact by adding the PK, since the affected rows have been logged in the transaction log as is and will replicate in this very inefficient way.
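In other words, with a PK in place each replicated delete would boil down to something like the following (column name assumed, value taken from the statement quoted above) instead of a WHERE clause comparing every column of the row:

-- Sketch of the PK-bound form SQL Remote would normally generate:
DELETE FROM user.ModLog WHERE ModLog = '100000JEUW';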
@André: IMHO, in case you can get exclusive hold of both databases, I would recommend breaking replication and starting from scratch with the PK defined (and the rows in question already deleted)... - Note, I usually don't recommend this, and maybe experts like Reg Domaratzki have better suggestions, but I guess the situation is rather stuck.
@Volker: Fortunately I have the nights where I can work exclusively on both databases. But the replication queue holds a lot of other changes pending to be sent/received in both directions, so I can't actually see how to "merge" those changes. (The DB itself is ~22 GB in size, so every trial is rather expensive in terms of data transfer time over the VPN.)
@André: FWIW, in theory you can still take the remote's transaction log, DBTRAN it into a SQL script to catch the "good changes", and then apply those via DBISQL in the consolidated database. I have done so successfully in rare cases, but with much smaller sets of changes (and it's error-prone, obviously). So, in theory... Time to wish you good luck.
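In bare-bones form that would mean something like the following (log and output file names are placeholders only):

dbtran old_remote_translog.log good_changes.sql

Then weed out the unwanted operations from good_changes.sql by hand and run the remainder via DBISQL against the consolidated database.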