Hi guys, could someone help me understand how hard drive seeking works?

I have a small binary database file for which read performance is absolutely essential. If I need to skip a few bytes in the file, is it quicker to use seek() or to read() and then discard the unwanted data?

If the average seek time of a hard drive is 10 ms and the read speed is 300 MB/s, I calculated that it's quicker to read() than to seek() for anything smaller than 3 MB (10 ms × 300 MB/s = 3 MB). Is that true? Is there an overhead when performing a new seek that reading an existing stream doesn't have?

Which do you think would be a more suitable file structure for an index?

Entry1:Value:PointerIntoData
Entry2:Value:PointerIntoData
Entry3:Value:PointerIntoData
Data, Data, Data

Or

Entry1:Value:Data
Entry2:Value:Data
Entry3:Value:Data

When reading an entry, if the value is not correct it will be ignored. So when streaming the file, is it quicker to:
1. use seek() to skip over an entry that is not required,
2. read an entry that is not needed and then discard the data, or
3. use the first structure and, when an entry is required, seek() into a data repository at the end?

Entry is 4 bytes, value is 8 bytes & data is 12KB
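Roughly, in C terms, the two record layouts I have in mind would look something like this (field names are made up, and the GCC-style packed attribute is only there so the sizes stay at 4 + 8 + 8 and 4 + 8 + 12288 bytes):

    #include <stdint.h>

    /* Layout 1: compact index records up front, 12 KB data blobs stored after them. */
    struct index_record {
        uint32_t entry;        /* 4-byte entry id */
        uint64_t value;        /* 8-byte value used for filtering */
        uint64_t data_offset;  /* byte offset of this entry's 12 KB blob */
    } __attribute__((packed)); /* 4 + 8 + 8 = 20 bytes on disk */

    /* Layout 2: each record carries its 12 KB of data inline. */
    struct inline_record {
        uint32_t entry;
        uint64_t value;
        uint8_t  data[12 * 1024];
    } __attribute__((packed)); /* 4 + 8 + 12288 bytes on disk */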

Cheers

+3  A: 

How "absolutely essential" is seek access? Have you tested your application with a non-optimal solution yet? During that testing, did you benchmark to determine where the real bottlenecks are? If you haven't, you'll be surprised by the results.

Next, try different methods and compare the running times. Test under different system loads (ie, when the system is idle except for your application, and when it is busy).
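As a starting point, a minimal timing sketch along these lines (POSIX clock_gettime; the candidate function is just a placeholder to be replaced with each approach you want to compare) is usually enough:

    #include <stdio.h>
    #include <time.h>

    /* Placeholder workload: replace with the seek-based or read-based variant. */
    static void candidate(void) {
        for (volatile int i = 0; i < 1000000; i++)
            ;
    }

    /* Time one run of fn with a monotonic clock and return seconds. */
    static double elapsed_seconds(void (*fn)(void)) {
        struct timespec start, end;
        clock_gettime(CLOCK_MONOTONIC, &start);
        fn();
        clock_gettime(CLOCK_MONOTONIC, &end);
        return (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9;
    }

    int main(void) {
        printf("candidate took %.6f s\n", elapsed_seconds(candidate));
        return 0;
    }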

Consider that optimizations based on your current hard drive may become invalid when a new, faster hard drive with different internal optimizations throws your work out the window.

lacqui
No, I haven't tested the program yet; I am still looking into different file structures. Every millisecond counts, and I'm interested in the theoretical maximum. So do you think I need a working test environment to find out? The hard drive may be under load from another process. Thanks
If, as you claim, every millisecond counts, try reading the database into memory. You say it's small (you quote 3M), so that should easily fit into your system memory. However, you still have to determine if the speed is a real or imagined requirement; ie, WHY do you need the speed?
lacqui
Very seldom, and only with pathological configurations, have I seen hardware characteristics be useful for optimizing software performance except in the very short term. And never until after thorough testing. Hardware changes too quickly for that kind of tuning to move up the list of "things to try".
le dorfier
+1  A: 

A sequential read is always faster than one that requires a head seek (not position seek). Typical hard drive perf for sequential read is 50-60 MB/sec, seeking drops that down to a worst-case ~0.4 MB/sec. Once the drive heads are positioned, you essentially get the data in the cylinder for free. The file system cache takes advantage of that by pre-reading sectors from a cylinder.

However, you have no control over the placement of your data on disk cylinders, nor can you guess the drive geometry. Note that throughput can get significantly worse over time as the volume gets fragmented. You'll need to look for performance gains by caching data in memory. At that point, you worry about locality of reference.

Hans Passant
What is the difference between a head seek and a position seek? Within a file, you can't presume that the cylinders will always be adjacent (ext3)? The data is split into 32 MB chunks which are read individually, but the number of chunks means they can't all be cached in memory at once.
@unknown, you are confused between the hard disk's seek mechanism and the system call. In practice, you will probably do better calling seek as you won't incur as much memory overhead from calling reads. But this depends on the specifics of your application.
BobbyShaftoe
@Bobby - You're right, I am confused. So the seek() system call does not always move the head? Only when movement to another cylinder is required?
When a read stream has to change cylinder, does it incur the same penalty as a seek() system call to that same data? Thanks
That's a yes and a yes.
Hans Passant
+4  A: 

All the seek system call does is change the position in the file where the next read will happen. It does not move the drive head. Drive heads move when data is read or written, and you don't have direct control over what the OS will do next.

Reading lots of data you aren't going to need does have an impact, because all data that is read needs space in OS buffers and causes older data to be discarded. So using seek over big files will disturb the filesystem cache less.
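As a sketch, assuming a POSIX file descriptor positioned at the start of an unwanted 12 KB block, the two ways of skipping it could look like this (constants and names are illustrative):

    #include <sys/types.h>
    #include <unistd.h>

    #define DATA_SIZE (12 * 1024)   /* size of one unwanted data block */

    /* 1. Just move the file offset; no data is copied into user space. */
    static int skip_with_seek(int fd) {
        return lseek(fd, DATA_SIZE, SEEK_CUR) == (off_t)-1 ? -1 : 0;
    }

    /* 2. Read the bytes and throw them away; this fills OS and user buffers. */
    static int skip_with_read(int fd) {
        char scratch[4096];
        size_t remaining = DATA_SIZE;
        while (remaining > 0) {
            size_t chunk = remaining < sizeof scratch ? remaining : sizeof scratch;
            ssize_t n = read(fd, scratch, chunk);
            if (n <= 0)
                return -1;          /* error or unexpected end of file */
            remaining -= (size_t)n;
        }
        return 0;
    }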


Everything I write below assumes you cannot fit the whole database in memory. If you can, just do that. Read everything, and try to append new and changed data at the end of the file. Don't worry about wasted space; just do some compacting once in a while.
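A minimal sketch of reading the whole file into one buffer, assuming plain POSIX calls and with most error handling trimmed:

    #include <fcntl.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Read the whole file at `path` into one heap buffer; returns NULL on error. */
    static char *load_whole_file(const char *path, size_t *out_size) {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return NULL;

        struct stat st;
        char *buf = NULL;
        if (fstat(fd, &st) == 0)
            buf = malloc((size_t)st.st_size);

        size_t done = 0;
        while (buf && done < (size_t)st.st_size) {
            ssize_t n = read(fd, buf + done, (size_t)st.st_size - done);
            if (n <= 0) {           /* error or unexpected end of file */
                free(buf);
                buf = NULL;
                break;
            }
            done += (size_t)n;
        }

        close(fd);
        if (buf && out_size)
            *out_size = done;
        return buf;
    }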


If your database is too big:

Data is read from and written to the physical drive in blocks (or pages). Similarly, the basic unit of disk IO in your OS is the page. If the OS caches data from disk, it also does so in whole pages. So worrying about whether you should move forward a few bytes using seek or read makes little sense. If you want to make it fast, you need to take into account how disk IO really works.
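For example, on a POSIX system you can ask for the page size and see which page a given byte offset falls into; skipping forward a few bytes almost always stays inside a page that has been read anyway (illustrative sketch):

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
        long page = sysconf(_SC_PAGESIZE);            /* commonly 4096 bytes */
        off_t offset = 12345;                          /* some byte position in the file */
        off_t page_start = offset - (offset % page);   /* page that byte belongs to */
        printf("page size %ld, byte %lld is in the page starting at byte %lld\n",
               page, (long long)offset, (long long)page_start);
        return 0;
    }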

First, as already mentioned by nobugz: locality of reference. If the data you use in each operation is located close together in the file, your OS will need to read or write fewer pages. On the other hand, if your data is spread out, many pages will need to be read or written, which will always be slow.

As to the data structure for the index: typically, indexes are organized as B-trees. That's a data structure designed specifically for efficient searching of large quantities of data stored in paged storage and accessed with paged reads and writes.
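Purely as an illustration (this is not the layout of any particular engine), an on-disk B-tree node is usually sized so that it fits in one page, something like:

    #include <stdint.h>

    #define MAX_KEYS 200   /* chosen so one node fits comfortably in a 4 KB page */

    struct btree_node {
        uint16_t num_keys;               /* how many keys are currently in use */
        uint16_t is_leaf;                /* 1 = leaf node, 0 = internal node */
        uint32_t padding;                /* keeps the arrays 8-byte aligned */
        uint64_t keys[MAX_KEYS];         /* e.g. the 8-byte "value" field as the key */
        uint64_t children[MAX_KEYS + 1]; /* child page numbers, or data offsets in a leaf */
    };
    /* 8 + 200*8 + 201*8 = 3216 bytes, i.e. well inside one 4 KB page. */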

And both strategies for organizing data are used in practice. For example, MS SQL Server by default stores data the first way: data is stored separately, and indexes only contain data from the indexed columns plus the physical addresses of the data rows in the files. But if you define a clustered index, then all data will be stored within that index, and all other indexes will point to the data via the clustered index key instead of a physical address. The first way is simpler, but the second may be much more effective if you often scan ranges of data based on the clustered index.

Tomek Szpakowicz
A: 

You could always map the file into memory and then access it through pointers and such. That should usually make your accesses simpler and faster.
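A minimal sketch of that approach using POSIX mmap, with a hypothetical index.db file name and abbreviated error handling:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("index.db", O_RDONLY);          /* hypothetical file name */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        /* Map the whole file read-only into the address space. */
        void *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }

        /* The file contents can now be read as ordinary memory. */
        const unsigned char *bytes = base;
        printf("first byte: %d\n", bytes[0]);

        munmap(base, (size_t)st.st_size);
        close(fd);
        return 0;
    }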