Will inserting null dates and updating them later cause fragmentation?

We have a transaction table like this:

 table txn (
    date_created    timestamp not null, 
    date_confirmed  timestamp null,
    ...
 )

Records are inserted with date_confirmed=null, using null to mean it has not been confirmed.

The majority of transactions are confirmed within a few minutes and date_confirmed is updated.

If disk space is not allocated for null values on insertion, will this lead to fragmentation? Is it better to insert a 'date-that-means-unconfirmed' instead?

best-practices fragmentation null

asked 28 Apr '11, 21:29

Mark
256●5●9●17
accept rate: 20%

edited 15 Sep '11, 08:27

Mark Culp
25.0k●10●142●298

flat view

2 Answers:

active answers oldest answers newest answers popular answers

It depends (doesn't it always :-), but I would say that if most rows will eventually have the nullable column date_confirmed column (and any other nullable columns) set to a non-null value then it is likely best to insert the row into the table with a non-null value.

Inserting rows with null values and the changing the null values to non-null will cause the row to grow and if there is insufficient free space on the page where the row is stored then the row will be split (or moved and a stub left behind in its original spot) and thus fragmenting the row. Exactly how much fragmentation occurs will depend on:

the page size or database
the size of your row (before and after null values are expanded)
the rate at which the rows are inserted relative to the time between initial insertion and the time that the null value(s) are made non-null

I.e. it really depends on how may splits are going to occur per page.

For example, if it is always the case that when you insert a row with a null value and you then set to a non-null value before any other row is inserted then it will likely be the case that doing this sequence will lead to at most one split row per page (i.e. the last row inserted on the page) and therefore this may not be that big of a deal ... but how "big" of a deal it is will depend on the number of rows per page (which depends on the size of the page and the size of a typical row).

However if you are inserting rows at a rapid rate (relative to the setting of the columns to non-null values) then it could be the case that each page is filled to its limit with rows containing nulls ... so when you set the nullable columns to non-null values there may not be sufficient space on the page to hold the now-larger-row and hence this could lead to every row on the page being split. This would be bad.

So, net result is that if you know that the column is going to have a non-null value then save yourself the grief and make the column non-nullable and initialize it to a out-of-band "date-that-means-unconfirmed" value. Declaring the column as "not null" has the bonus that the optimizer may be able to use this information to possibly choose more efficient plans when the column is used as a predicate in your queries.

permanent link

answered 28 Apr '11, 21:58

Mark Culp
25.0k●10●142●298
accept rate: 41%

edited 28 Apr '11, 22:07

Mark, thanks very much for your detailed answer. It will influence my table designs in future. In this case, subsequent inserts before an update are not unlikely. According to the alter table documentation we can still add a PCTFREE clause, so we'll do some estimates and see if we need to change the default.

(28 Apr '11, 22:52) Mark

From a performance point of view, I fully agree with Mark's answer.

However, if performance is not that of a problem (or can be solved with a proper PCTFREE setting), IMHO a null value does express the non-existence of a value much better (and is easier to understand) than an application-specific "special" value.

(29 Apr '11, 09:40) Volker Barth

flat view

2	see also the question: will updating a null timestamp rewrite the row? permanent link answered 29 Apr '11, 03:33 Martin 9.1k●131●170●257 accept rate: 14% flat view

Your answer

toggle preview

community wiki:

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

*italic* or _italic_
**bold** or __bold__
link:[text](http://url.com/ "title")
image?![alt text](/path/img.jpg "title")
numbered list: 1. Foo 2. Bar
to add a line break simply add two spaces to where you would like the new line to be.
basic HTML tags are also supported

learn more about Markdown

Question tags:

best-practices ×23
null ×20
fragmentation ×9

question asked: 28 Apr '11, 21:29

question was seen: 3,888 times

last updated: 15 Sep '11, 08:27

SAP SQL Anywhere

Will inserting null dates and updating them later cause fragmentation?

Follow this question

Related questions