The useful Schema Analysis function in SQL Central gives a useful set of recommendations. One of these identifies sub-optimal column ordering saying:

Ordering columns so that wider columns precede narrow columns can negatively impact response times. Narrow columns should be defined in the table declaration before wider columns, unless they are accessed infrequently. Consider re-ordering table columns so that the wider columns are defined at the end. Wide columns are columns greater than 15 in size, or LONG data types (for example, LONG VARCHAR), or columns defined as XML.

This inevitably happens in a database whose structure evolves over time, but unless I'm missing something there's no easy (or even feasible) way of addressing this in a large schema with hundreds of tables / thousands of columns, foreign keys etc.

Since for other purposes, one really doesn't care about column order, what would seem to be a really useful feature, would be an option on dbunload / UNLOAD to optimize the column order automatically during the unload - presumably the primary key would come first, then columns involved in constraints, and then everything else in ascending size.

While an automatically calculated order might not be entirely optimal, it is likely to be better than a more or less random one!

Better ideas eagerly requested!

asked 23 Jun, 09:19

Justin%20Willey's gravatar image

Justin Willey
7.3k128165235
accept rate: 20%

edited 23 Jun, 09:21

1

presumably the primary key would come first, then columns involved in constraints, and then everything else in ascending size.

I was about to second that suggestion.

However, while I would also suggest to put PK columns first, and possibly go on with UNIQUE KEY columns and NON NULL FK columns and then NULLable FK columns, strictly sorting all "normal columns" by size (what size? declared vs. real (average) size?) would seem rather "unsorted" to me. I guess then I'll prefer to stick with the historical order. :)

In other words: I'd either re-sort the order myself (in a "logical" order) or let it be as-is. That being said, I guess I have rarely designed tables where the PK was not the first column(s). And if so, then I would certainly try to correct that as soon as possible by manually re-creating the table (via INSERT WITH AUTO NAME SELECT...).

(23 Jun, 10:21) Volker Barth
Replies hidden

Certainly I can't think of an example where the primary key isn't the first column, but certainly some of the other stuff gets added a lot later!

(23 Jun, 11:35) Justin Willey

IMO this exhortation is an incorrect description of third-order effect on runtime performance. It is not discussing first-order set-level retrieval performance, but the scalar analysis of columns in a single row:

The order of the columns in a table affects performance. Columns in a row are accessed sequentially in the order of their creation. For example, to access columns at the end of a row, the database server traverses the columns that appear earlier in the row. You should order your columns so that narrow and/or frequently accessed columns are placed before seldom accessed and/or wider columns in the table. Wide columns are columns greater than 15 bytes in size, or LONG data types (for example, LONG VARCHAR), or columns defined as XML. Primary key columns are always stored at the beginning of row.

In other words, the engine has to "step over" column values 1 to 10 in each row in order to reach column value 11. If you are really worried about these calculations, you shouldn't worry that 55+1 is going to be faster than 55+100 after you move all the short values to the front... because it isn't... addition performance is not value dependent.

Note that it doesn't have to step-over-byte-by-byte, in case that's what you were thinking :)

Instead, you should move all the MOST-FREQUENTLY-REFERENCED columns to the front, so the engine has to "step over" fewer columns (fewer calculations)... but I doubt that you will see any improvement, EVEN IF YOU CAN identify the most-frequently-referenced columns (go ahead, try :)

The mention about LONG columns is particularly misleading since most of their data isn't even stored in the row.

Suggestion: Try a benchmark test, betcha won't be able to measure the effect :).

permanent link

answered 23 Jun, 10:43

Breck%20Carter's gravatar image

Breck Carter
30.5k493674986
accept rate: 20%

converted 24 Jun, 14:54

Justin%20Willey's gravatar image

Justin Willey
7.3k128165235

Note that it doesn't have to step-over-byte-by-byte, in case that's what you were thinking :)

I think that is what I was thinking it could be :}

(23 Jun, 11:27) Justin Willey
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×261
×168
×12

question asked: 23 Jun, 09:19

question was seen: 58 times

last updated: 23 Jun, 11:35