I have a text file that contains extent information for every file in the file system, like this:

C:\Program Files\abcd.txt 12345 100 23456 200
C:\Program Files\bcde.txt 56789 50 26746 300
...

I also have another binary that looks up the extents for all the files. Currently it uses a linear search to find each file's extent info in the text file above, which is time consuming. Is there a better way to code this, such as using a suitable data structure like a B-tree? If a B+ tree is used, what key and branching factor should I use?

A: 

It depends on how you want to search the file. I assume you want to look up the info given a file name, in which case a hash table or a trie would be a good data structure to use.
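As a rough illustration of the hash table idea, here is a minimal Python sketch that loads the text file into a dictionary once, so that each later lookup takes constant time on average instead of a linear scan. It assumes one file per line, with the path followed by pairs of integers; reading each pair as (start, length) is a guess about the format, and `parse_extents` is a made-up name.

```python
import re

# Assumed layout: one file per line, a path followed by one or more
# integer pairs. Treating each pair as (start, length) is a guess.
RECORD = re.compile(r"^(.+?)((?:\s+\d+\s+\d+)+)\s*$")

def parse_extents(lines):
    """Build a dict mapping each path to its list of (start, length) pairs."""
    index = {}
    for line in lines:
        match = RECORD.match(line)
        if match is None:
            continue  # skip blank or malformed lines
        path, numbers = match.groups()
        values = [int(n) for n in numbers.split()]
        index[path] = list(zip(values[0::2], values[1::2]))
    return index

sample = [
    r"C:\Program Files\abcd.txt 12345 100 23456 200",
    r"C:\Program Files\bcde.txt 56789 50 26746 300",
]
extents = parse_extents(sample)
print(extents[r"C:\Program Files\bcde.txt"])
```

In practice the lines would come from the open text file rather than a list. The regular expression treats the trailing run of integer pairs as the extents, so paths that contain spaces still parse.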

A B-tree is possible, but it is not the most convenient choice given that your keys are strings.

Laurynas Biveinis
+3  A: 

Use a database.

The key points in implementing a tree in a file are to have fixed record lengths and to use file offsets instead of pointers.
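To make those two points concrete, here is a small Python sketch of a fixed-length node record whose children are stored as file offsets. The layout (a 64-byte key plus two child offsets) and the helper names are hypothetical, and it uses a binary node for brevity; a B-tree node would hold several keys and child offsets in the same fixed-size fashion.

```python
import io
import struct

# Hypothetical node layout: a 64-byte key (NUL-padded) plus the file offsets
# of the left and right children; -1 means "no child".
NODE = struct.Struct("<64sqq")

def write_node(f, key, left=-1, right=-1):
    """Append a node to the file and return the offset it was written at."""
    f.seek(0, io.SEEK_END)
    offset = f.tell()
    f.write(NODE.pack(key.encode("utf-8"), left, right))
    return offset

def read_node(f, offset):
    """Read the node stored at the given file offset."""
    f.seek(offset)
    key, left, right = NODE.unpack(f.read(NODE.size))
    return key.rstrip(b"\0").decode("utf-8"), left, right

f = io.BytesIO()  # stands in for a real file opened in binary mode
left = write_node(f, "a.txt")
right = write_node(f, "c.txt")
root = write_node(f, "b.txt", left, right)
print(read_node(f, root))
```

Because every record has the same size, a child can be reached with a single seek to its stored offset, which is what replaces following a pointer in memory.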

Use a database. Hmmm, SQLite.
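For example, a minimal sketch with Python's built-in `sqlite3` module might look like the following. The table name, the column names, and the reading of each number pair as (start, length) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist the index
conn.execute(
    "CREATE TABLE extents ("
    " path TEXT NOT NULL,"
    " seq INTEGER NOT NULL,"      # position of the extent within the file
    " start INTEGER NOT NULL,"
    " length INTEGER NOT NULL,"
    " PRIMARY KEY (path, seq))"
)
rows = [
    (r"C:\Program Files\abcd.txt", 0, 12345, 100),
    (r"C:\Program Files\abcd.txt", 1, 23456, 200),
    (r"C:\Program Files\bcde.txt", 0, 56789, 50),
    (r"C:\Program Files\bcde.txt", 1, 26746, 300),
]
conn.executemany("INSERT INTO extents VALUES (?, ?, ?, ?)", rows)
conn.commit()

def lookup(conn, path):
    """Return the (start, length) pairs for one path, in extent order."""
    cur = conn.execute(
        "SELECT start, length FROM extents WHERE path = ? ORDER BY seq",
        (path,),
    )
    return cur.fetchall()

print(lookup(conn, r"C:\Program Files\abcd.txt"))
```

The primary key on (path, seq) gives the database an index to search, so a lookup by path does not have to scan the whole table.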

Another point to consider with files is that reading in chunks of data is faster than reading individual items (regardless of whether or not the hard disk has a cache or the OS has a cache). I implemented a B+Tree, which uses pages as its nodes.

Use a database. Databases have already been written and tested.

A more efficient design is to keep the initial node in memory. This reduces the number of fetches from the file. If your program has the space, keeping the first couple of levels in memory may also speed up execution.
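A rough Python sketch of that idea: read the file one page at a time and keep the first page or pages in memory so they are fetched only once. The 4096-byte page size, the assumption that the root node lives in page 0, and the `PagedFile` class are all illustrative choices, not part of any particular B-tree implementation.

```python
import io

PAGE_SIZE = 4096  # assumed page size; one tree node per page

class PagedFile:
    """Reads whole pages and keeps the first few pinned in memory."""

    def __init__(self, f, pinned_pages=1):
        self.f = f
        self.reads = 0  # number of fetches that actually hit the file
        # Assumes the root node is page 0 and the upper levels follow it.
        self.pinned = {n: self._read(n) for n in range(pinned_pages)}

    def _read(self, page_no):
        self.f.seek(page_no * PAGE_SIZE)
        self.reads += 1
        return self.f.read(PAGE_SIZE)

    def page(self, page_no):
        """Return a page, served from memory when it is pinned."""
        cached = self.pinned.get(page_no)
        return cached if cached is not None else self._read(page_no)

f = io.BytesIO(bytes(PAGE_SIZE * 3))  # stands in for a three-page index file
pages = PagedFile(f)
pages.page(0)  # root: already in memory, no file access
pages.page(2)  # leaf: fetched from the file
print(pages.reads)
```

Every search starts at the root, so pinning it saves one fetch per lookup; raising `pinned_pages` trades memory for fewer fetches on the next levels as well.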

Use a database.

I gave up writing a B-Tree implementation for my application because I wanted to concentrate on the other functionality of the program. I later learned that in the real world (the world where programs need to be finished on a schedule), time should be spent on the 'core' of the application rather than on accessories that have already been written and tested (a.k.a. off-the-shelf).

Thomas Matthews