I need to store large amounts of data on disk in approximately 1k blocks. I will be accessing these objects in a way that is hard to predict, but where patterns probably exist.

Is there an algorithm or heuristic I can use that will rearrange the objects on disk based on my access patterns to try to maximize sequential access, and thus minimize disk seek time?

A: 

The simplest way to solve this is to use an OS that handles it for you under the hood, like Linux. Give it enough RAM to hold 10% of the objects and it will try to keep as many of them in the cache as possible, reducing the load time to effectively zero. Recent server versions of Windows might work too (some of them didn't for me, which is why I mention it).

If that's a no-go, try this algorithm:

  • Create a very big file on the hard disk. It is very important that you write this in one go so the OS will allocate contiguous space on disk.

  • Write all your objects into that file. Make sure that each object is the same size (or give each the same space in the file and note the length in the first few bytes of each chunk). Use an empty hard disk or a disk which has just been defragmented.

  • In a data structure, keep the offsets of each data chunk and how often it is accessed. When a chunk is accessed very often, swap its position in the file with a chunk that is closer to the start of the file and has a lower access count (see the sketch after this list).

  • [EDIT] Access this file with the memory-mapped API of your OS so that it can effectively cache the most-used parts, giving you the best performance until you can optimize the file layout next time.
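
A minimal sketch of that bookkeeping, assuming fixed-size 1 KB chunks in one big file opened in update mode; the names, the chunk size, and the simple promote-by-one-swap policy are illustrative assumptions, not part of the answer:

    #include <stdio.h>

    #define CHUNK_SIZE 1024                  /* assumed fixed chunk size */

    typedef struct {
        long offset;                         /* byte offset of the chunk in the big file */
        long access_count;                   /* how often this chunk has been accessed   */
    } ChunkInfo;

    /* Swap the on-disk contents of two chunks and update their offsets.
       'f' must be opened in update mode ("rb+"). Returns 0 on success. */
    static int swap_chunks(FILE *f, ChunkInfo *a, ChunkInfo *b)
    {
        static char bufa[CHUNK_SIZE], bufb[CHUNK_SIZE];

        if (fseek(f, a->offset, SEEK_SET) || fread(bufa, 1, CHUNK_SIZE, f) != CHUNK_SIZE)
            return -1;
        if (fseek(f, b->offset, SEEK_SET) || fread(bufb, 1, CHUNK_SIZE, f) != CHUNK_SIZE)
            return -1;
        if (fseek(f, a->offset, SEEK_SET) || fwrite(bufb, 1, CHUNK_SIZE, f) != CHUNK_SIZE)
            return -1;
        if (fseek(f, b->offset, SEEK_SET) || fwrite(bufa, 1, CHUNK_SIZE, f) != CHUNK_SIZE)
            return -1;

        long tmp = a->offset;                /* the chunks now live at each other's offsets */
        a->offset = b->offset;
        b->offset = tmp;
        return 0;
    }

    /* After 'hot' has been accessed, bump its count and move it towards the
       front of the file if 'colder' (a chunk nearer the start) is used less. */
    static void maybe_promote(FILE *f, ChunkInfo *hot, ChunkInfo *colder)
    {
        hot->access_count++;
        if (colder->offset < hot->offset && colder->access_count < hot->access_count)
            swap_chunks(f, hot, colder);
    }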

Over time, heavily accessed chunks will bubble to the top. Note that you can collect the access patterns over some time, analyze them, and do the reorder overnight when there is little load on your machine. Or you can do the reorder on a completely different machine and swap in the new file (and the offset table) when it's done.

That said, you should really rely on a modern OS where a lot of clever people have thought long and hard to solve these issues for you.

Aaron Digulla
+1  A: 

Use memory-mapped file access rather than the usual open-seek-read/write pattern. This technique works on Windows and Unix platforms.

In this way the operating system's virtual memory system will handle the caching for you. Accesses of blocks that are already in memory will result in no disk seek or read time. Writes from memory back to disk are handled automatically and efficiently and without blocking your application.

Aaron's notes are good too, as they will affect initial-load time for a chunk that's not in memory. Combine that with the memory-mapped technique; after all, it's easier to reorder chunks using memcpy() than by reading and writing from disk and attempting swap-outs.
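
As a rough illustration of the memory-mapped approach on a POSIX system (Windows would use CreateFileMapping/MapViewOfFile instead); the file name and chunk size below are placeholders, not anything from the question:

    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    #define CHUNK_SIZE 1024                  /* assumed chunk size */

    int main(void)
    {
        int fd = open("objects.dat", O_RDWR);            /* placeholder file name */
        struct stat st;
        if (fd < 0 || fstat(fd, &st) < 0)
            return 1;

        /* Map the whole file; the OS pages chunks in and out as needed. */
        char *base = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (base == MAP_FAILED)
            return 1;

        /* Reading a chunk is just a pointer into the mapping
           (assuming the file holds at least 43 chunks). */
        char buf[CHUNK_SIZE];
        memcpy(buf, base + 42 * CHUNK_SIZE, CHUNK_SIZE);

        /* Reordering two chunks is a pair of memcpy() calls plus a temporary. */
        char tmp[CHUNK_SIZE];
        memcpy(tmp, base, CHUNK_SIZE);
        memcpy(base, base + 42 * CHUNK_SIZE, CHUNK_SIZE);
        memcpy(base + 42 * CHUNK_SIZE, tmp, CHUNK_SIZE);

        munmap(base, st.st_size);
        close(fd);
        return 0;
    }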

Jason Cohen
Thanks, I completely forgot about mmap :)
Aaron Digulla
I'm not convinced that mmap will make the actual disc access any faster. In any case, IO is done in pages, so even if you only want a few bytes, a whole page is brought in, and another page needs to be thrown out to make room. The trick is to get as much useful stuff as possible in a page.
MarkR
+1  A: 

Depending on what you mean by "hard to predict", I can think of a few options:

If you always seek based on the same block field/property, store the records on disk sorted by that field. This lets you use binary search for O(log n) efficiency.

If you seek on different block fields, consider storing an external index for each field. A b-tree gives you O(log n) efficiency. When you seek, grab the appropriate index, search it for your block's data file address and jump to it.
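
Here is a minimal sketch of the first option above (records stored sorted, found with binary search), assuming fixed 1 KB records whose first 8 bytes hold the sort key; the record layout is an assumption, and a real external B-tree index is more involved than this:

    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>

    #define RECORD_SIZE 1024                 /* assumed fixed record size */

    /* Binary search a file of fixed-size records sorted by a leading 64-bit
       key. On success, returns 0 and leaves the record in 'out'. */
    int find_record(int fd, long num_records, uint64_t key, char out[RECORD_SIZE])
    {
        long lo = 0, hi = num_records - 1;
        while (lo <= hi) {
            long mid = lo + (hi - lo) / 2;
            if (pread(fd, out, RECORD_SIZE, (off_t)mid * RECORD_SIZE) != RECORD_SIZE)
                return -1;

            uint64_t k;
            memcpy(&k, out, sizeof k);       /* key assumed to sit in the first 8 bytes */
            if (k == key)
                return 0;
            if (k < key)
                lo = mid + 1;
            else
                hi = mid - 1;
        }
        return -1;                           /* not found */
    }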

Better yet, if your blocks are homogeneous, consider breaking them down into database records. A database gives you optimized storage, indexing, and the ability to perform advanced queries for free.
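
As one hedged example of that approach, SQLite's C API could store the blocks as rows with an index on each field you seek by; the table layout and file name here are made up for illustration:

    #include <sqlite3.h>

    int main(void)
    {
        sqlite3 *db;
        if (sqlite3_open("blocks.db", &db) != SQLITE_OK)     /* placeholder file name */
            return 1;

        /* One row per block, with an index on each field you seek by. */
        sqlite3_exec(db,
            "CREATE TABLE IF NOT EXISTS blocks("
            "  id   INTEGER PRIMARY KEY,"
            "  tag  TEXT,"                                   /* hypothetical lookup field */
            "  data BLOB);"
            "CREATE INDEX IF NOT EXISTS idx_tag ON blocks(tag);",
            NULL, NULL, NULL);

        sqlite3_close(db);
        return 0;
    }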

Corbin March
A: 

Do you know that you need to go to these lengths to optimise your code? This sounds like premature optimisation. Why not write something sane first, and speed it up later when you've proven you have a bottleneck?

Jon Topper
A: 

That's an interesting challenge. Unfortunately, I don't know how to solve this out of the box, either. Corbin's approach sounds reasonable to me.

Here's a little optimization suggestion, at least: place the most-accessed items at the center of your disk (or unfragmented file), not at the start or end. That way, seeks to lesser-used data will be shorter on average. Err, that's pretty obvious, though.

Please let us know if you figure out a solution yourself.

Thomas Tempelmann
+3  A: 

On modern OSes (Windows, Linux, etc) there is absolutely nothing you can do to optimise seek times! Here's why:

  1. You are in a pre-emptive multitasking system. Your application and all its data can be flushed to disk at any time - the user switches task, the screen saver kicks in, the battery runs out of charge, etc.
  2. You cannot guarantee that the file is contiguous on disk. Doing Aaron's first bullet point will not ensure an unfragmented file. When you start writing the file, the OS doesn't know how big the file is going to be, so it could put it in a small space, fragmenting it as you write more data to it.
  3. Memory mapping the file only works as long as the file size is less than the available address range in your application. On Win32, the amount of address space available is about 2 GB minus the memory used by the application. Mapping larger files usually involves un-mapping and re-mapping portions of the file, which won't be the best of things to do (see the sketch after this list).
  4. Putting data in the centre of the file is no help as, for all you know, the central portion of the file could be the most fragmented bit.
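
To illustrate point 3, here is a rough sketch of windowed mapping on a POSIX system (the chunk size is assumed); this is the kind of un-mapping and re-mapping that point refers to, not a recommendation:

    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define CHUNK_SIZE 1024                  /* assumed chunk size */

    /* Read one chunk from a file too large to map in full by mapping a small,
       page-aligned window around it and unmapping it again afterwards. */
    int read_chunk_windowed(int fd, long chunk_index, char out[CHUNK_SIZE])
    {
        long   page = sysconf(_SC_PAGESIZE);
        off_t  pos  = (off_t)chunk_index * CHUNK_SIZE;
        off_t  base = pos - (pos % page);    /* mmap offsets must be page-aligned */
        size_t len  = (size_t)(pos - base) + CHUNK_SIZE;

        char *win = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, base);
        if (win == MAP_FAILED)
            return -1;
        memcpy(out, win + (pos - base), CHUNK_SIZE);
        munmap(win, len);
        return 0;
    }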

To paraphrase Raymond Chen, if you have to ask about OS limits, you're probably doing something wrong. Treat your filesystem as an immutable black box; it just is what it is (I know, you can use RAID and so on to help).

The first step you must take (and must be taken whenever you're optimising) is to measure what you've currently got. Never assume anything. Verify everything with hard data.

From your post, it sounds like you haven't actually written any code yet, or, if you have, there is no performance problem at the moment.

The only real solution is to look at the bigger picture and develop methods to get data off the disk without stalling the application. This would usually be through asynchronous access and speculative loading. If your application is always accessing the disk and doing work with small subsets of the data, you may want to consider reorganising the data to put all the useful stuff in one place and the other data elsewhere. Without knowing the full problem domain, it's not possible to be really helpful.
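
As a hedged illustration of speculative loading on a POSIX system: hint to the kernel which chunks you expect to need next so it can read them ahead without stalling the application. The chunk size and the choice of what to prefetch are assumptions:

    #define _POSIX_C_SOURCE 200112L          /* for posix_fadvise() */
    #include <fcntl.h>

    #define CHUNK_SIZE 1024                  /* assumed chunk size */

    /* Hint that the next few chunks are likely to be needed soon so the
       kernel can read them ahead asynchronously, without blocking here. */
    void prefetch_chunks(int fd, long next_chunk, int count)
    {
        posix_fadvise(fd, (off_t)next_chunk * CHUNK_SIZE,
                      (off_t)count * CHUNK_SIZE, POSIX_FADV_WILLNEED);
    }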

Skizz