When formatting a 931GB disk drive for NTFS on Windows Vista 64-bit, the default allocation unit size seems to be 4K, but other choices are available: 8K, 16K, 32K, 64K.

If the drive's going to be dedicated to a single version 11.0.1 SQL Anywhere database file, does it make sense for the allocation unit size to match the database page size?

The title says "8K" but that is just today's choice... I'm asking the question in general, for SQL Anywhere page sizes 4K, 8K, 16K, 32K.

asked 21 Nov '09, 11:27

Breck Carter

edited 21 Nov '09, 11:42


I'm going to say "it depends".... but here are some comments that will hopefully help.

Let's consider the various cases:

1) database page size < NTFS sector size

If page size < sector size, then each time a page is read from the file, the OS (or the disk controller) needs to read the full sector, extract the desired part of the sector - i.e. the page - and then hand that page to the db server. So more data is read from disk than was requested - maybe not that bad on a spinning disk, since the head had to fly over the whole track anyway.

But let's look at the write case. When writing a single page into a larger sector, the OS (or disk controller) needs to read the entire sector, copy the new page contents into the buffered sector contents, and then write the whole sector back to the disk. This read-modify-write costs a minimum of two rotations of the disk. Not good!
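To make the penalty concrete, here is a minimal sketch (hypothetical sizes, not anything SQL Anywhere does internally) of how many bytes must move to or from the disk to persist one page when the page is smaller than the allocation unit:

```python
def bytes_transferred_per_page_write(page_size: int, unit_size: int) -> int:
    """Bytes moved to/from disk to persist one page of `page_size` bytes."""
    if page_size >= unit_size:
        # Whole units are overwritten; no read-back is needed.
        return page_size
    # The OS/controller must read the full unit, patch the page into it,
    # and write the full unit back: one read plus one write.
    return 2 * unit_size

# Writing a 4 KB page into an 8 KB allocation unit moves 16 KB of data...
assert bytes_transferred_per_page_write(4096, 8192) == 16384
# ...while a matched 4 KB page in a 4 KB unit moves only the 4 KB itself.
assert bytes_transferred_per_page_write(4096, 4096) == 4096
```

And that factor-of-four in transferred bytes understates the cost, since the read and the write of the same unit cannot overlap.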

2) database page size > NTFS sector size

So what happens here? When a db page is read, the OS will read multiple sectors from the disk in order to get the entire page. This can almost always be satisfied by issuing a single block read, and the sectors will be sequential on the track, so it is not a big issue. Depending on your disk layout, there could conceivably be the odd time when logically sequential sectors are not physically sequential on disk - e.g. the page overlaps two tracks - but this is likely rare for most disks. Fragmentation of your database file would also cause multiple seeks to be required in order to read the entire page, and this would be costly. (Hint: keep your database file defragmented)
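The arithmetic for this case is simple but worth spelling out - a sketch with hypothetical sizes:

```python
def units_per_page(page_size: int, unit_size: int) -> int:
    """How many allocation units one database page spans (ceiling division)."""
    return -(-page_size // unit_size)

# A 16 KB page over 4 KB allocation units spans 4 units.
# If the file is contiguous, those units are adjacent on disk (one seek);
# a file broken into fragments may need a separate seek per fragment touched.
assert units_per_page(16384, 4096) == 4
assert units_per_page(16384, 16384) == 1
```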

The write case is similar: when writing a single page, multiple sectors need to be written in order to get the entire page to disk, just as multiple sectors had to be read. But one issue that could be of concern is that if power is lost during the write, it is possible that only some of the sectors make it to disk. We call this a "torn write". SQL Anywhere implements a number of techniques to handle the torn write case and recover the database. (Note: some recent changes have been made in EBFs to specifically handle some torn write issues in the checkpoint log)
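One common way to *detect* a torn write (a generic illustration only - not a description of SQL Anywhere's actual recovery techniques) is to store a checksum inside each page, so a page whose sectors came from two different writes fails verification:

```python
import zlib

PAGE_SIZE = 8192    # hypothetical 8 KB page built from 4 KB sectors
CHECKSUM_LEN = 4

def seal_page(payload: bytes) -> bytes:
    """Append a CRC32 so a partially written (torn) page can be detected."""
    assert len(payload) == PAGE_SIZE - CHECKSUM_LEN
    return payload + zlib.crc32(payload).to_bytes(4, "little")

def is_torn(page: bytes) -> bool:
    payload, stored = page[:-CHECKSUM_LEN], page[-CHECKSUM_LEN:]
    return zlib.crc32(payload).to_bytes(4, "little") != stored

page = seal_page(b"x" * (PAGE_SIZE - CHECKSUM_LEN))
assert not is_torn(page)

# Simulate power loss after only the first 4 KB sector reached disk:
torn = page[:4096] + b"\x00" * (PAGE_SIZE - 4096)
assert is_torn(torn)
```

On recovery, a page that fails the check is discarded and rebuilt from a known-good copy (e.g. a checkpoint log image) rather than trusted.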

3) database page size == NTFS sector size

None of the above issues apply - read/write one db page == read/write one sector.

But that is not the end of the story. You should also consider the storage device's natural sector size when choosing a file system allocation unit size. Until recently the disk sector size has been 512 bytes, but starting in 2006 the industry has been moving to a sector size of 4096 bytes ("Advanced Format") in order to handle the ever-increasing sizes of disks. As such, it is likely wise not to choose an allocation unit size less than 4096 (or you get case 1 above at the device level). If you choose one larger than 4096, issues similar to case 2 above may result.
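The three cases can be summarized in a small decision sketch (the 4096-byte physical sector is an assumption matching an Advanced Format drive; substitute your device's real value):

```python
PHYSICAL_SECTOR = 4096  # assumed Advanced Format drive

def classify(unit_size: int, sector: int = PHYSICAL_SECTOR) -> str:
    """Map a candidate allocation unit size onto the three cases above."""
    if unit_size < sector:
        return "case 1: sub-sector writes force read-modify-write"
    if unit_size == sector:
        return "case 3: one unit == one sector, no penalty"
    if unit_size % sector == 0:
        return "case 2: multi-sector transfers; torn writes possible"
    return "misaligned: every unit straddles a sector boundary"

for candidate in (512, 4096, 8192):
    print(candidate, "->", classify(candidate))
```

On Windows you can see the actual values (bytes per cluster, bytes per physical sector) with `fsutil fsinfo ntfsinfo <drive>:`.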

But in conclusion the answer is going to be "it depends". Having said all of the above, you are going to have to base your decision on your database schema (and hence the best db page size to use) and on what file system overhead differences there will be (as Reimer has mentioned). For most applications I suspect it doesn't matter - many other issues will overwhelm any performance difference between sector sizes. Performance testing with your database application is the only real way to know.

And a final note: all of the above assumes you are using a spinning disk. If you are using another type of storage - e.g. SSD or flash - then seek times are not a concern, but case 1 is even worse, due to the limited (albeit large) number of writes that current flash-type media can handle before wearing out.


answered 25 Nov '09, 23:04

Mark Culp

Mark, thanks for that thorough explanation - I have never given these issues much thought, but they seem worthwhile to understand.

(26 Nov '09, 10:32) Volker Barth

If you're primarily using the drive for only a few large files, I would tend toward a larger allocation unit. This means less MFT overhead and (hopefully) better I/O performance.

I don't think memory pages and disk clusters are that closely coupled. Provided memory is sufficient, some parts of the database will stay in the system cache.

But wait to hear what the real experts say (this is more guesswork on my side).


answered 24 Nov '09, 10:47

Reimer Pods

BTW I did manage to get the 600GB database file down to two fragments but no fewer, on an empty drive, and it was spectacularly difficult to do even that well, at least for me. I have it on my to-blog list.

(24 Nov '09, 20:18) Breck Carter