Does anyone know of a library for encoding primitive types (integers, floats, strings, etc.) into strings such that the lexicographic order of the encoded strings matches the natural order of the original values?

Ideally, I'm looking for a C++ library, but other languages are fine too. Also, the type does not need to be encoded in the string itself: whether the value is an int64, a string, or a float, the encoded string only needs to carry the data, not the type information.

A: 

Just write numeric values in a fixed column width with leading zeros, and strings as normal. So like this:

0.1 -> 0000000.1000000
123 -> 0000123.0000000
foo -> foo
X   -> X

Then you can sort as text (e.g. Unix sort without -n). How about that?
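
A minimal C++ sketch of the numeric half of this idea, reproducing the 7-digit / 7-decimal layout from the examples above. The function name `encode_fixed_width` and the chosen range are assumptions for illustration, not part of any library. Negative numbers are rejected because a leading minus sign would not sort correctly as text.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical helper: encode a non-negative number as exactly 15 characters,
// 7 integer digits, a dot, and 7 fractional digits, zero-padded on the left.
// Values outside [0, 9999999] are rejected because they do not fit the width.
std::string encode_fixed_width(double value) {
    if (!std::isfinite(value) || std::signbit(value) || value > 9999999.0) {
        throw std::out_of_range("value does not fit the fixed-width format");
    }
    char buf[32];
    // "%015.7f": total width 15, zero-padded, 7 digits after the decimal point.
    const int n = std::snprintf(buf, sizeof buf, "%015.7f", value);
    assert(n == 15);  // holds for every value accepted by the check above
    (void)n;
    return std::string(buf);
}
```

With this, `encode_fixed_width(9.5)` compares less than `encode_fixed_width(10.25)` as a plain string, whereas the unpadded text "9.5" would sort after "10.25".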

John Zwinck
I would like to avoid encoding numbers in fixed width. Also, encoding strings as themselves won't give the right sorting order if a string contains the character that you're using as a separator.
nilton
Then write your own sort routine.
John Zwinck
A: 

Take a look at this paper ("Efficient Lexicographic Encoding of Numbers"), which shows how to represent any numeric type as a string such that the lexicographic order of the strings is the same as the numerical order of the underlying numbers. It copes with arbitrary-length numbers.

http://www.zanopha.com/docs/elen.pdf
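
To give a feel for how a variable-width encoding can still sort correctly, here is a small C++ sketch of a simpler, related idea: prefix the decimal digits with a single character that encodes how many digits follow. This is not the paper's scheme, only an illustration under stated assumptions. It handles non-negative 64-bit integers only, assumes an ASCII-compatible character set, and the name `encode_length_prefixed` is made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: encode a non-negative integer as one length character
// followed by its decimal digits (no leading zeros, no fixed width).
// The length character is 'A' for 1 digit, 'B' for 2 digits, and so on up to
// 'T' for 20 digits, which is the longest a uint64_t can be. Numbers with
// fewer digits therefore sort first, and numbers with the same digit count
// are ordered by the digits themselves.
std::string encode_length_prefixed(std::uint64_t value) {
    const std::string digits = std::to_string(value);
    assert(!digits.empty() && digits.size() <= 20);
    const char prefix = static_cast<char>('A' + (digits.size() - 1));
    return prefix + digits;
}
```

For example, 9 becomes "A9" and 10 becomes "B10", so "A9" sorts before "B10" even though the bare text "9" sorts after "10". Supporting arbitrary-length or negative numbers needs a more careful construction, which is what the paper addresses.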

Peter
Interesting... I'm taking a look at the paper.
nilton