Hi, I have a 1 GB file of tables with data separated into columns. I parse it and store the result in a hash, which I then use for further work. But while developing my code, every time I compile and run it for testing, the "parsing and storing into hash" step is executed again, which makes my program slow.

Is there any way I can store the hash so that I don't need to rebuild it again and again?

+2  A: 

Not really. That information has to be loaded into memory somehow. Nevertheless, serializing the hash object to disk can help, since the deserialization is probably faster than your code.

You could check out freeze, or read the Wikipedia article on serialization for further hints.

Check out the Perl documentation for FreezeThaw:

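use FreezeThaw qw(freeze thaw cmpStr safeFreeze cmpStrHard);
# Serialize several data structures into a single string.
$string = freeze $data1, $data2, $data3;
# ... later, possibly in another run of the program ...
($olddata1, $olddata2, $olddata3) = thaw $string;
# cmpStr returns 0 when the two structures are equal.
if (cmpStr($olddata2,$data2) == 0) {print "OK!"}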

All you need to do now is store $string in a file once the data is parsed, then read it back and thaw it on later runs!
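The "parse once, store, reload later" idea could look roughly like the sketch below. It swaps in the core Storable module in place of FreezeThaw so that nothing has to be installed from CPAN; that swap, the function names parse_tables and load_tables, the file names in the usage comment, and the tab-separated layout with the key in the first column are all assumptions made for illustration, not details from the question.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);

# Hypothetical parser: assumes tab-separated columns, key in the first column.
sub parse_tables {
    my ($path) = @_;
    my %data;
    open my $in, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$in>) {
        chomp $line;
        my ($key, @fields) = split /\t/, $line;
        next unless defined $key;      # skip empty lines
        $data{$key} = \@fields;
    }
    close $in;
    return \%data;
}

# Return the hash from the cache file when the cache is at least as new
# as the source file; otherwise parse the source and rewrite the cache.
sub load_tables {
    my ($source, $cache) = @_;
    if (-e $cache && (stat $cache)[9] >= (stat $source)[9]) {
        return retrieve($cache);
    }
    my $data = parse_tables($source);
    nstore($data, $cache);
    return $data;
}

# Usage (hypothetical file names):
#   my $tables = load_tables('tables.txt', 'tables.cache');
```

Comparing modification times means the cache is rebuilt automatically whenever the source file changes, so only the first run after a change pays the parsing cost.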

Daren Thomas
[`Storable`](http://p3rl.org/Storable) is in core and is more widely used than `FreezeThaw`.
daxim
Sorry. It has been ten years since I last did Perl. I'm a bit rusty and not sure how it works anymore... But I really like the function names `freeze` and `thaw`.
Daren Thomas
A: 

Perl does not store data very efficiently. In the worst case, a data structure can take tens of times (20-80x) more memory than the raw data. That happens only in the worst case, and if it had happened with your 1 GB dataset you would have noticed, so I don't think it applies to you. Perl data structures are very fast; they often trade memory for speed. If the amount of memory used in your case is reasonable, you can live with it and use the straightforward approach recommended by Daren Thomas or, more likely, Storable as recommended by daxim.

If you measure the memory consumption and find it is too high, you can go with an embedded key/value store. If you will not modify the data after loading it, you can use CDB_File, which is a little faster than BerkeleyDB, but the latter allows you to modify the data on the fly. You can also choose the latter because it is the more common and flexible solution.
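As a rough illustration of the embedded key/value approach, here is a minimal sketch using a tied hash. It uses the core SDBM_File module purely as a stand-in so the example runs without extra modules; CDB_File and BerkeleyDB have their own interfaces, which are not shown here. SDBM also limits the size of each key/value pair (about 1 KB), so it is not a realistic choice for wide rows. The function names build_store and lookup, and joining fields with tabs, are assumptions of this sketch.

```perl
use strict;
use warnings;
use Fcntl;        # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;

# Build the on-disk store once. Values must be flat strings, so the
# fields of each row are joined with tabs (an assumption of this sketch).
sub build_store {
    my ($db_path, $rows) = @_;    # $rows: hash ref of key => array ref
    tie my %db, 'SDBM_File', $db_path, O_RDWR | O_CREAT, 0666
        or die "Cannot create $db_path: $!";
    $db{$_} = join "\t", @{ $rows->{$_} } for keys %$rows;
    untie %db;
}

# Look up one row without loading the whole table into memory.
sub lookup {
    my ($db_path, $key) = @_;
    tie my %db, 'SDBM_File', $db_path, O_RDONLY, 0666
        or die "Cannot open $db_path: $!";
    my $value = $db{$key};
    untie %db;
    return defined $value ? [ split /\t/, $value ] : undef;
}
```

A real program would normally keep the hash tied for its whole run instead of reopening the store on every lookup; the per-call tie here only keeps the sketch short.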

Hynek -Pichi- Vychodil