How is performance impacted when you have a table that, lets say could have over 1M rows, and there are 3-5 columns that are consistently NULL.

Is it bad design to have data structures in place that support this? Should we look at normalizing it a little more?

Thanks! -Calvin

P.S. Love this place.

asked 15 Jan '10, 03:40

Calvin%20Allen's gravatar image

Calvin Allen
1.5k232638
accept rate: 25%

edited 15 Jan '10, 04:14


Glad you like this place! It's only going to get better, when more people show up.

I'm not sure what you are asking in your second sentence, in particular I'm not sure what NULL has to do with normalization.

Tackling the first question...

I don't think NULL values cause performance problems per se, at least not that you'd notice. I know that historically, especially with those icky legacy DBMS products like [expletive deleted] and [bodily harm], folks used to worry a lot about fixed-versus-variable length columns, physical ordering of columns in rows, [excrement] like that.

What CAN hurt performance somewhat is inserting rows with all nulls, all zeroes, all empty character strings, and then later updating rows to fill in the values: instant row splits across multiple pages, instant fragmentation of your physical database storage. At least, that was true historically with SQL Anywhere... I am not absolutely sure V11 hasn't eased that problem, I will let Mr. Paulley et al fill in the blanks here.

I haven't run any INSERT/UPDATE/rowsplit tests lately, and when I did it wasn't easy to demonstrate the problem, you had to work at it. And it wasn't directly caused by NULLs per se, but by increasing value sizes over time.

I guess I'm saying, there are lots of reasons to hate NULLs but performance isn't one of them.

permanent link

answered 15 Jan '10, 08:06

Breck%20Carter's gravatar image

Breck Carter
25.7k427587847
accept rate: 20%

I guess the question is: Is is good design to put similar things in only one table when some of them can't have values for a few fields whereas others can, or should one put the addional fields in a separate child table, i.e. the common question of OOP-Types/Subtypes and how to map them to a RDM. - Well, there's a whole ORM industry as an answer:)

(15 Jan '10, 09:12) Volker Barth
3

Breck is correct about causing row splits due to updates of columns that were initially NULL. Some of this can be mitigated through the use of new INLINE column specification modifier introduced in Version 10, but the behaviour remains.

(15 Jan '10, 13:22) Glenn Paulley

Excellent!

The main reason is for asking is we have had some debate on whether or not to place a column on a table, if say, 75% of the time, it would remain NULL.

(15 Jan '10, 14:34) Calvin Allen

Gee, I didn't know you could upvote comments... cool! I like this SQLA thing! :)

(15 Jan '10, 23:03) Breck Carter

@Volker: Are you saying nulls make ORMs harder? Are you trying to make me love nulls? <grad>

(15 Jan '10, 23:05) Breck Carter

@Breck: Well, I can't upvote comments (or don't know how to) though I have noticed that this can be done. May it be bound to the amount of score or the part one plays in the current question/answer?

(16 Jan '10, 09:48) Volker Barth

Ah, found it out. Obviously I can upvote comments on my own questions/answers. Nice feature, really:)

(16 Jan '10, 09:50) Volker Barth

@Breck: I can't say much about ORMs, as I haven't used one so far. Guess I share your point of view of prefering SQL over some abstraction on that, as long as the data model isn't too complex. W.r.t. nulls, I think they are useful in certain circumstances, and for that, I prefer them. But I agree, one shouldn't use them too frequently...if a column mustn't accept nulls, it should be NOT NULL. - Well, this answer might be more serious than your questions:)

(16 Jan '10, 09:56) Volker Barth
More comments hidden
showing 5 of 8 show all flat view

In addition to Breck's and Martin's responses, Nulls are quite ignorable w.r.t. to storage size:

From the SA 11 docs:

Column space utilization for NULL values is 1 bit per column and space is allocated in multiples of 8 bits. The NULL bit usage is fixed based on the number of columns in the table that allow NULL values.

permanent link

answered 15 Jan '10, 09:18

Volker%20Barth's gravatar image

Volker Barth
30.3k301454660
accept rate: 32%

Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×242
×113

question asked: 15 Jan '10, 03:40

question was seen: 1,048 times

last updated: 15 Jan '10, 20:03