On a daily basis, we have to delete large volumes of data, and our delete statements can take a long time to run. We've had some success using the MERGE statement, but there is one table in particular where that won't work. I suspect the slow performance of the deletes is because they constantly write to the transaction log. Is there a way to speed this up by not writing to the log (i.e. disabling it briefly and then turning it back on)? We can't use TRUNCATE TABLE because the delete only removes a subset of the rows in the table. Thanks, Tom

asked 14 Jul '10, 04:13

Tom Rolseth

Make sure you're committing quite frequently during the delete - otherwise the rollback log will become huge, as it has to store all deleted rows until the transaction is committed or rolled back. So I would try to split the delete into smaller chunks, e.g. by using "DELETE TOP n ... ORDER BY <yourpreferredorder>; COMMIT;" in a loop until all wanted rows are deleted (see the sketch below).
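For illustration, such a chunked delete might look like the following Watcom SQL batch - just a minimal sketch, assuming a table my_table, a filter column fd_client and a chunk size of 10,000; replace those names and the ORDER BY column with your own:

    BEGIN
        DECLARE rows_deleted INTEGER;
        SET rows_deleted = 1;
        WHILE rows_deleted > 0 LOOP
            -- delete one chunk; TOP needs an ORDER BY to be deterministic
            DELETE TOP 10000 FROM my_table
                WHERE fd_client = '0048'
                ORDER BY my_pk_column;
            SET rows_deleted = @@rowcount;  -- rows affected by the last DELETE
            COMMIT;  -- releases locks and empties the rollback log
        END LOOP;
    END;

Each COMMIT frees the rollback log entries for that chunk and releases the row locks, so other connections are blocked for much shorter periods.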

(14 Jul '10, 07:19) Volker Barth

Volker - I tried this today using a simple test procedure. I didn't notice any improvement in performance -- it took just as long as running the delete without a loop in ISQL. But you think this will cut back on log growth? That in itself would be good -- our log grows by over 1 GB every day until it is restarted during our backup process. Tom

(15 Jul '10, 22:06) Tom Rolseth

@Tom: The split approach won't reduce the size of the translog, as the contents of each deleted row still have to be stored there. That's necessary for recovery purposes. It should reduce the size of the rollback log (which itself is part of the translog AFAIK), which contains all uncommitted changes and is dropped on rollback/commit. But the reduced size of the rollback log is only a temporary effect and as such should not impact the overall translog file size.
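If you want to watch that effect, the number of uncommitted operations of a connection is available as a connection property - a quick sketch (CONNECTION_PROPERTY also takes a second parameter to inspect another connection's number):

    -- number of uncommitted operations held by the current connection
    SELECT CONNECTION_PROPERTY('UncommitOp');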

(17 Jul '10, 10:53) Volker Barth

@Tom: Whether the split approach will speed up your deletes significantly depends on the actual query. Are lots of locks held during the delete (or might the delete have to wait for other connections to commit)? - It might be helpful if you could present your query (with plan) here so that experts like Glenn can give advice...
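As an aside: a plan can be fetched as plain text with the PLAN() function, which returns the optimizer's strategy for the quoted statement as a string - a sketch with placeholder names (if your version's PLAN() doesn't accept DML, dbisql's plan tab or GRAPHICAL_PLAN() are alternatives):

    SELECT PLAN('DELETE FROM my_table WHERE fd_client = ''0048''');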

(17 Jul '10, 10:57) Volker Barth

Did you find any answer? If so, please post it here. Thanks!

(28 Jul '10, 08:49) Breck Carter

Here is a standard delete for one of our client uploads:

DELETE FROM gt_payments WHERE fd_entity = '0048'; COMMIT;

The gt_payments table has just shy of 1 million rows, about 50,000 of which are for client 0048. I just ran this statement in ISQL and it took 12 minutes to complete. The table has a primary key of 3 columns and has 3 indexes -- none of which I created. Per Ron's point below, how can I determine if one or more of these indexes is unnecessary?

I created a plan. What's the best way to get it uploaded here? thanks, Tom

(29 Jul '10, 19:53) Tom Rolseth

Just to correct that older statement: AFAIK now, the rollback log is part of the system dbspace, not of the translog - to quote the 12.0.1 docs:

Whenever modifications such as inserts, updates, or deletes are made to the database, entries are added to the rollback log, which is stored within the system dbspace. If many of these operations are performed before a commit is executed, the rollback log can become very large and may increase the size of the database.
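So an uncommitted mass delete may show up as growth of the database file itself. If you want to check, the file size is available as a database property - a small sketch:

    -- system dbspace size in pages, and the page size in bytes
    SELECT DB_PROPERTY('FileSize'), DB_PROPERTY('PageSize');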

(21 Aug '12, 11:35) Volker Barth

The transaction log is critical for recovery -- so allow me to restate your question as "How can I improve the performance of my DELETEs?"

My first course of action would be what Volker stated in his comment: loop through a series of smaller deletes, committing each time. I don't know exactly why this works, but it does for me -- presumably because each small transaction holds fewer locks and keeps the rollback log small, as Volker suggests.

Another thing is that when you delete records, every related index has to be updated and all dependencies have to be checked. So make sure you don't have any unnecessary indexes defined, and no unnecessary foreign key relationships either (the catalog queries below may help you check).
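For example, you could list the indexes on the table and the foreign keys referencing it from the catalog - a sketch against the SYSINDEXES and SYSFOREIGNKEYS compatibility views (the table name is taken from Tom's comment above; verify the view columns against your version's docs):

    -- indexes on gt_payments, with their column lists
    SELECT iname, indextype, colnames
        FROM SYS.SYSINDEXES
        WHERE tname = 'gt_payments';

    -- foreign keys from other tables that reference gt_payments
    SELECT foreign_tname, role
        FROM SYS.SYSFOREIGNKEYS
        WHERE primary_tname = 'gt_payments';

An index whose leading columns merely duplicate the primary key, or a foreign key no application relies on, is a candidate for dropping.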

answered 28 Jul '10, 11:52

Ron Hiner

You also want to make sure you're not running any delete triggers. (Probably obvious, but maybe not for everyone.)
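A quick way to check could be a query against the trigger catalog - a sketch using the SYSTRIGGERS compatibility view (column names from memory, so verify against your docs):

    -- triggers defined on gt_payments, with their event and timing
    SELECT trigname, event, trigtime
        FROM SYS.SYSTRIGGERS
        WHERE tname = 'gt_payments';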

(28 Jul '10, 15:59) carolstone

You could restart the engine in bulk operations mode (the -b server option) - see the sketch below. But for a production environment this would not be a good idea.
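For example, starting a database with the personal server in bulk mode might look like this (dbeng12 and the file name are placeholders for your server version and database; IIRC, in this mode only a single connection is allowed):

    dbeng12 -b mydb.db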

Regards Thomas

answered 14 Jul '10, 07:13

Thomas Duemesnil

Just as a further pointer (without knowing if the actual problem is still unsolved...):

Nick Elson [Sybase iAnywhere] has listed a lot of factors that can make a difference between the performance of a SELECT statement and a corresponding DELETE based on the same SELECT.

So it's basically focused on the question "Why is a delete much slower than the corresponding select?".

I don't know if this is the actual question here - but I'd like to quote the answer anyway. As it is in some kind of home-brewed table format, I hope I have quoted it correctly:)

Deletes need to

  • modify table and index pages
  • modify all indexes on the table
  • allocate rollback log pages and fill them with insert operations (this can grow the file, too)
  • possibly use much more cache around check constraints & publications
  • log every operation
  • sometimes update statistics
  • set off trigger actions, RI actions, cascade actions
  • take exclusive locks, so they block more readily and are exposed to more contention
  • cause disk reads and writes [and writes are more expensive than reads]
  • visit all pages associated with all columns, especially extras: CLOBs, BLOBs, long types, ...
  • deal with replication, which could even be playing a role here

Selects, in contrast,

  • access only the pages required
  • visit only the parts of the indexes the plan uses
  • write to the rollback log only if done in an updatable cursor
  • update stats only occasionally
  • fire no triggers or checks
  • typically use only shared locking
  • may see little or no contention at low isolation levels
  • may have no I/O to do at all if the cache is warm, except when returning results to the client
  • may skip some of the extra stuff like BLOBs
  • are not much affected by replication

Not to mention the potential for system resource limitations leading to swapping to the paging file, I/O contention with other processes, etc.

Much depends upon the state of the server, the specific schema, the data distribution, the state of the table and indexes in the database, and other activity on other connections and in applications.

Source: news://forums.sybase.com/sybase.public.sqlanywhere.general, thread "Slow deletes vs selects" starting 2010-03-31.

answered 28 Jul '10, 15:43

Volker Barth
