We have a setup with two database instances, connected via SQL Remote using FTP as the message transport.

One table held about 6 million rows, and we then deleted ~5 million of them on the remote site in blocks of ~500,000 rows (this took ~20 minutes).

Now we have ~2,700 messages flowing back from the remote to the consolidated database.

Our problem is that it now takes about 50 minutes per FTP message to apply these deletes from the remote side, so applying all the messages will take around 96 days... And of course all other "real" updates/inserts etc. are just queued up behind them in more messages...

We don't see much I/O traffic from the database, but it uses 100% of one CPU core (out of 8 available).

What could we do to speed up applying these messages? We have already added the useful indexes, but that did not improve performance.

Is there a chance that this process would be faster:

  • Stop dbremote
  • Unload the still-needed rows from the table
  • Then delete all entries in that table
  • Finally restart dbremote
  • After all SQL Remote messages have been applied, reload the rows we still need

Actually, we just can't afford to wait 96 days for the queue to clear up...

Or any other ideas how to speed up dbremote (using more than one CPU/core, for example)?
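For reference, the unload/empty/reload idea could be sketched roughly as follows in SQL Anywhere syntax. The table name is taken from the statement quoted below; the file path and the keep-condition are placeholders, and the exact UNLOAD/LOAD/TRUNCATE behaviour (especially its interaction with the transaction log and SQL Remote) would need checking against the v11 docs first:

```sql
-- Sketch only: file path and keep-condition are placeholders.

-- 1. Save the rows we still need to a flat file
UNLOAD
  SELECT * FROM "user".ModLog WHERE <keep-condition>
  TO '/tmp/modlog_keep.dat';

-- 2. Empty the table; TRUNCATE avoids row-by-row logging, unlike DELETE
TRUNCATE TABLE "user".ModLog;

-- 3. (restart dbremote and let it drain the ~2700 queued messages)

-- 4. Reload the saved rows once the queue is empty
LOAD TABLE "user".ModLog FROM '/tmp/modlog_keep.dat';
```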

André

asked 09 Dec '10, 13:24

ASchild

What's your DBREMOTE command line? Particularly the -m setting? And the -g setting? While SQL Remote turns all SQL operations into single-row operations (leading to 5 million single DELETE statements here), the required time seems unreasonably high.

(09 Dec '10, 13:56) Volker Barth

The statement that finally executes in the database is:

DELETE FIRST FROM user.ModLog WHERE ModLog='100000JEUW' AND Users IS NULL AND LoginName='--system--' AND FullName='--system--' AND ActionDate='11:40:44 .74326 2010/10/13' AND TableName='Xdarinfo' AND FieldName='actperiod' AND Primarykey='1000000DPE' AND ActionType=1 AND OldValue IS NULL AND NewValue IS NULL AND Remark IS NULL AND OldString IS NULL AND NewString='1108-0910' AND OldDate IS NULL AND NewDate IS NULL AND OldNumeric IS NULL AND NewNumeric IS NULL AND Language IS NULL

(09 Dec '10, 16:15) ASchild

The command line on the consolidated (where it will take 96 days) is:

dbremote -c "uid=dba;pwd=XXXX;eng=sqlsvXX;dbn=XXXXXXXXX;CS=UTF-8" -l 1024000 -m 10M -ud -o /var/log/sqlremote.log /Data/SQLData/XXXX

(09 Dec '10, 16:16) ASchild

On the remote (where I ran the delete statements) it's -c "uid=dba;pwd=XXXXX;eng=XXXX;dbn=nlpdbhulst;CS=UTF-8" -l 1024000 -m 10M -x 20M -o C:\Data\logs\dbremote.log D:\Data\SQLData\database\

(09 Dec '10, 16:18) ASchild

It's SQL Anywhere 11.0.1.2427 under Windows 2008 64-bit as the remote db, and SQL Anywhere 11.0.1.2376 under Debian Linux 64-bit.

(09 Dec '10, 16:21) ASchild

Perhaps the important part: that table has NO primary key defined. By mistake I missed that, and I tried to first remove the many rows before changing the definition... probably that's what's causing these problems...

(09 Dec '10, 16:28) ASchild
1

@André: Yes, the missing PK will cause such awful DELETE statements. Usually SQL Remote uses PK-bound (and very efficient) single-row DELETE statements like "DELETE Table1 WHERE pk = x". In your case, I guess you can't fix the problem after the fact by adding the PK, since the affected rows have already been logged in the transaction log as-is and will replicate in this very inefficient way.
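To make the contrast concrete (assuming, hypothetically, that the ModLog column could serve as the PK):

```sql
-- With a primary key, SQL Remote replays a delete as a cheap index lookup:
DELETE FROM "user".ModLog WHERE ModLog = '100000JEUW';

-- Without one, it must compare every column (and use FIRST to remove only
-- one of possibly several identical rows), forcing a scan per statement:
DELETE FIRST FROM "user".ModLog
 WHERE ModLog = '100000JEUW'
   AND Users IS NULL
   AND LoginName = '--system--';
   -- ...plus equality / IS NULL tests on all remaining columns,
   -- as in the full statement quoted above
```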

(09 Dec '10, 16:59) Volker Barth

@André: IMHO, in case you can get exclusive hold of both databases, I would recommend breaking replication and starting from scratch with the PK defined (and the rows in question already deleted)... Note, I usually don't recommend this, and maybe experts like Reg Domaratzki have better suggestions, but I guess the situation is rather stuck.

(09 Dec '10, 17:03) Volker Barth

@Volker: Fortunately I have the nights, when I can work exclusively on both databases. But the replication queue holds a lot of other changes pending to be sent/received in both directions, so I can't actually see how to "merge" these changes. (The db itself is ~22 GB in size, so every trial is rather expensive in terms of data transfer time over the VPN.)

(09 Dec '10, 17:10) ASchild

@André: FWIW, in theory you can still use the remote's transaction log and DBTRAN it into a SQL script to catch the "good changes", and then apply those via DBISQL on the consolidated. I have done so successfully in rare cases, but with much smaller sets of changes (and it's error-prone, obviously). So, in theory... Time to wish you good luck.

(09 Dec '10, 17:32) Volker Barth
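A rough sketch of that salvage path, assuming standard dbtran/dbisql invocations (the log file name is a placeholder, the connection string is the masked one from above, and the exact options should be checked against your build; as noted, the whole approach is error-prone):

```shell
# Translate the remote's transaction log into a plain SQL script
# (-y: overwrite the output file without prompting)
dbtran -y remote.log changes.sql

# Manually edit changes.sql to keep only the "good" changes, then
# apply the script against the consolidated database
dbisql -c "uid=dba;pwd=XXXX;eng=sqlsvXX;dbn=XXXXXXXXX" changes.sql
```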

Lesson learnt from this question:

The strong recommendation that all replicated tables should have a primary key is not only about avoiding inconsistent data (see the doc note below) but also about avoiding miserable performance...

I wasn't aware of this problem - though, fortunately, I have never used SQL Remote without fitting PKs.

A table without a primary key or unique constraint refers to all columns in the WHERE clause of replicated updates

When two remote databases make separate updates to the same row and replicate the changes to the consolidated database, the first changes to arrive on the consolidated database are applied; changes from the second database are not applied.

As a result, databases become inconsistent. All replicated tables should have a primary key or a unique constraint and the columns in the constraint should never be updated.


answered 10 Dec '10, 08:38

Volker Barth

I was aware of that recommendation...

but as time goes by, we sometimes make mistakes... and have to pay for them :)

(10 Dec '10, 09:37) ASchild

@André: Agreed:) - But it's a huge benefit when others can learn from one's own mistakes, too, with the help of this site – at least that's my hope...

(10 Dec '10, 11:15) Volker Barth

I have now done the following:

  • I removed the table in question from the replication on both sides
  • Then I deleted the 5 million rows on the consolidated manually
  • Then I unloaded the remaining 1 million rows on both sides
  • Changed the table definition to include a primary key
  • Re-added the table to the replication
  • Then I restarted dbremote, and it applied 2-3 messages per second
  • After about one hour, all 2,700 replication messages had been processed

So the databases are now in sync and replicating again. What is still left to do is to deduplicate the unloaded data and then reload it into the table.
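The sequence above might look roughly like this in SQL. The publication name, the PK column choice, and the scratch-table name are placeholders, and the ALTER PUBLICATION syntax should be verified against the v11 docs:

```sql
-- On both databases: take the table out of the publication
ALTER PUBLICATION my_pub DROP TABLE "user".ModLog;

-- (manual deletes on the consolidated, UNLOAD of the remaining rows)

-- Add the missing primary key
ALTER TABLE "user".ModLog ADD PRIMARY KEY (ModLog);

-- Put the table back into the publication
ALTER PUBLICATION my_pub ADD TABLE "user".ModLog;

-- Still pending: dedup the unloaded rows and reload them, e.g. via a
-- scratch table loaded from the unload file
INSERT INTO "user".ModLog
SELECT DISTINCT * FROM "user".ModLog_unloaded;
```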

Thanks Volker for your help.

André


answered 10 Dec '10, 06:36

ASchild


That's a remarkable speed-up - glad you could avoid the re-extract.

(10 Dec '10, 08:41) Volker Barth

Yep, of course scanning an empty table for all fields matching the values from the remotely deleted record is much faster than doing it in a table with 5-6 million rows.

(10 Dec '10, 09:38) ASchild

question asked: 09 Dec '10, 13:24

question was seen: 801 times

last updated: 10 Dec '10, 08:38