Hello,

basically, I've got my Huffman table as

std::map<std::string, char> ciMap;

Here the string is the bit pattern and the char is the value that pattern represents. How do I store this map as a header in my compressed file so that I can rebuild the same map when I want to decode it?

Trying to store it as binary:

size_t mapLen = ciMap.size();
outFile.write(reinterpret_cast<char*>(&mapLen), sizeof(size_t));
outFile.write(reinterpret_cast<char*>(&ciMap), sizeof(ciMap));

And later building with:

inFile.read(reinterpret_cast<char*>(&mapLen), sizeof(size_t));
inFile.read(reinterpret_cast<char*>(&ciMap), sizeof(mapLen));

This doesn't work: I get a string initialization error, something to do with NULL. Any suggestions? If you know a better way of storing the bits and values, I'd like to hear it.

+2  A: 

Great question. The problem here is that the standard containers don't support serialization. You have to write it yourself; it's a pain, but it's possible.

Here's how you could serialize a std::map to a textual format. You can adapt it to whatever binary format you need: just replace the << and >> operators with write and read calls.

template<typename K, typename V>
std::ostream &operator << (std::ostream &out, const std::map<K,V> &map) {
    out << "map " << map.size() << "\n";
    for (typename std::map<K,V>::const_iterator i = map.begin(); i != map.end(); ++i) {
        out << (*i).first << "\n" << (*i).second << "\n";
    }
    return out;
}

template<typename K, typename V>
std::istream &operator >> (std::istream &in, std::map<K,V> &map) {
    std::string mapkeyword;
    size_t num;
    in >> mapkeyword >> num;
    for (size_t i = 0; i < num; ++i) {
        K key; V value;
        in >> key >> value;
        map[key] = value;
    }
    return in;
}
Frank Krueger
+2  A: 

You can't just serialize the binary values to disk in this way. The in-memory representation is not a single contiguous block of memory, and even if it were, it would likely contain pointers, whose values are only meaningful inside the process that wrote them.
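As a small illustration of this point, and assuming a C++11 compiler (the original question does not say which standard is in use), the type traits below can check at compile time whether a type may safely be copied byte for byte. The typedef name CodeMap is hypothetical.

```cpp
#include <map>
#include <string>
#include <type_traits>

// Hypothetical alias for the asker's table type.
typedef std::map<std::string, char> CodeMap;

// Only trivially copyable types may be saved by dumping their bytes
// with write() and restored with read().
static_assert(std::is_trivially_copyable<char>::value,
              "a plain char can be written as raw bytes");

// std::string and std::map own heap memory through internal pointers,
// so neither of them qualifies.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string cannot be written as raw bytes");
static_assert(!std::is_trivially_copyable<CodeMap>::value,
              "std::map cannot be written as raw bytes");
```

If any of these assumptions were wrong, the code would fail to compile rather than misbehave at run time.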

You need to iterate over the map and serialize out each item individually. Then to bring them back in you reconstruct the map by reading the items off disk one by one and reinserting them into the map.
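A minimal sketch of this item-by-item approach in binary form. It assumes bit patterns are at most 255 characters long and that the file is read on a machine with the same byte order as the one that wrote it. The names CodeMap, writeCodeMap, readCodeMap, and roundTrips are hypothetical, and the record layout (a 32-bit count, then a length byte, the pattern, and the value for each entry) is only one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

typedef std::map<std::string, char> CodeMap;

// Writes the entry count, then each entry as [length][pattern][value].
void writeCodeMap(std::ostream &out, const CodeMap &codes) {
    // The count is stored in native byte order (an assumption of this sketch).
    const std::uint32_t count = static_cast<std::uint32_t>(codes.size());
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    for (CodeMap::const_iterator it = codes.begin(); it != codes.end(); ++it) {
        assert(it->first.size() <= 255);  // the length must fit in one byte
        const unsigned char len = static_cast<unsigned char>(it->first.size());
        out.put(static_cast<char>(len));
        out.write(it->first.data(), len);
        out.put(it->second);
    }
}

// Rebuilds the map; returns false if the input ends early or cannot be read.
bool readCodeMap(std::istream &in, CodeMap &codes) {
    std::uint32_t count = 0;
    if (!in.read(reinterpret_cast<char*>(&count), sizeof count)) return false;
    codes.clear();
    for (std::uint32_t i = 0; i < count; ++i) {
        char lenByte = 0;
        char value = 0;
        if (!in.get(lenByte)) return false;
        const std::size_t len = static_cast<unsigned char>(lenByte);
        std::string bits(len, '\0');
        if (len > 0 && !in.read(&bits[0], static_cast<std::streamsize>(len)))
            return false;
        if (!in.get(value)) return false;
        codes[bits] = value;
    }
    return true;
}

// Usage sketch: round-trip through an in-memory stream instead of a file.
bool roundTrips(const CodeMap &codes) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    writeCodeMap(buffer, codes);
    CodeMap restored;
    return readCodeMap(buffer, restored) && restored == codes;
}
```

With real files, both the std::ofstream and the std::ifstream would need to be opened with std::ios::binary so that no newline translation changes the bytes.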

Rob Walker
+4  A: 

You can do it yourself, or you can do it with Boost.Serialization: http://www.boost.org/doc/libs/1_37_0/libs/serialization/doc/index.html. What you are currently doing is treating the map as a plain old data (POD) type, which essentially means a C-style data type whose bytes can be copied directly. But it isn't one, so saving and loading fail. Boost.Serialization handles this correctly, so have a look at it. If you don't want to use it, you can do something like this:

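#include <fstream>
#include <map>
#include <string>
#include <utility>

typedef std::map<std::string, char> my_map;
my_map ciMap;

// saving: one "pattern value" pair per line
std::ofstream out("file.txt");
for(my_map::const_iterator it = ciMap.begin(); it != ciMap.end(); ++it) {
    out << it->first << " " << it->second << std::endl;
}
out.close(); // make sure everything is on disk before reading it back

// loading: the streams need distinct names if both live in the same scope
char c;
std::string bits;
std::ifstream in("file.txt");
while(in >> bits >> c)
    ciMap.insert(std::make_pair(bits, c));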

Note that the above needs some changes if the stored characters could be whitespace characters too, because operator>> skips whitespace when it reads a char. For that reason, it's probably best to convert each character to an int before writing it out, and to read it back as an int when loading. Actually, I recommend Boost.Serialization and Boost.Iostreams (http://www.boost.org/doc/libs/1_37_0/libs/iostreams/doc/index.html), which includes compression filters that can transparently compress your data too.
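The int conversion suggested above could look roughly like the following sketch. The helper names saveAsText, loadFromText, and demoTextRoundTrip are hypothetical, and the code assumes every bit pattern is a non-empty string without whitespace.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

typedef std::map<std::string, char> my_map;

// Writes one "pattern code" pair per line, with the character stored as
// its numeric code so that ' ' and '\n' survive the round trip.
void saveAsText(std::ostream &out, const my_map &codes) {
    for (my_map::const_iterator it = codes.begin(); it != codes.end(); ++it) {
        out << it->first << ' '
            << static_cast<int>(static_cast<unsigned char>(it->second)) << '\n';
    }
}

// Reads pairs until the input runs out or a pair cannot be parsed.
void loadFromText(std::istream &in, my_map &codes) {
    std::string bits;
    int code = 0;
    while (in >> bits >> code)
        codes.insert(std::make_pair(bits, static_cast<char>(code)));
}

// Usage sketch: whitespace values come back unchanged.
void demoTextRoundTrip() {
    my_map codes;
    codes["0"] = ' ';
    codes["10"] = '\n';
    codes["11"] = 'e';

    std::stringstream buffer;
    saveAsText(buffer, codes);

    my_map restored;
    loadFromText(buffer, restored);
    assert(restored == codes);
}
```

Going through unsigned char first keeps the written number in the range 0 to 255 even on platforms where char is signed.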

Johannes Schaub - litb