ansaurus

Question

Can you represent CSV data in Google's Protocol Buffer format?

Answer 1

A:

Well, protobuf-net (my version) is based on regular .NET types, so no (since it won't cope with different schemas all the time). But Jon's version might allow dynamic types. Personally, I'd just use CSV and run it through GZipStream - I expect that will be fine for the purpose.

Edit: actually, I forgot: protobuf-net does support extensible objects, but you need to be a bit careful... it would depend on the full context, I expect.

Plus Jon's approach of nested data would probably work too.
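
As a rough illustration of the "plain CSV through a gzip stream" suggestion above, here is a minimal sketch. It uses Python's standard `csv` and `gzip` modules rather than .NET's GZipStream, and the helper names `compress_csv` and `decompress_csv` are made up for this example.

```python
import csv
import gzip
import io


def compress_csv(rows):
    """Write rows as CSV text and gzip the result, returning bytes."""
    text = io.StringIO(newline="")
    csv.writer(text).writerows(rows)
    return gzip.compress(text.getvalue().encode("utf-8"))


def decompress_csv(blob):
    """Reverse compress_csv: gunzip, then parse the CSV text back into rows."""
    text = gzip.decompress(blob).decode("utf-8")
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical sample data: a header line plus 1000 repetitive rows.
rows = [["id", "price"]] + [[str(i), "19.99"] for i in range(1000)]
blob = compress_csv(rows)
restored = decompress_csv(blob)
```

How much this saves depends entirely on the data. Repetitive text like the sample above compresses well; real files may behave differently.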

Marc Gravell 2008-12-16 14:32:19

Sorry, not sure if I made it clear - I'm also adding extra data to the CSV, sometimes as extra columns and sometimes as header or footer data. I'd like this data to be version-proof, which is why I was thinking about other storage methods.

Cameron MacFarland 2008-12-16 14:42:04

Answer 2

+2 A:

Well, it's certainly representable. Something like:

message CsvFile {
    repeated CsvHeader header = 1;
    repeated CsvRow row = 2;
}

message CsvHeader {
    required string name = 1;
    required ColumnType type = 2;
}

enum ColumnType {
    DECIMAL = 1;
    STRING = 2;
}

message CsvRow {
    repeated CsvValue value = 1;
}

// Note that the column is implicit, based on the value's position within the row
message CsvValue {
    optional string string_value = 1;
    optional Decimal decimal_value = 2;
}

message Decimal {
    // However you want to represent it (there are various options here)
}

I'm not sure how much benefit it will provide, mind you... You can certainly add more information (add fields to the CsvFile message), and future-proofing works the "normal PB way": only add optional fields, etc.
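
To make the positional layout concrete, here is a minimal sketch that mirrors the messages above with plain Python dataclasses. These are stand-ins for illustration, not generated protobuf classes. The `load` helper, its `column_types` parameter, and the use of `decimal.Decimal` for the unspecified Decimal message are all assumptions of this example.

```python
import csv
import io
from dataclasses import dataclass, field
from decimal import Decimal
from enum import Enum
from typing import List, Optional


class ColumnType(Enum):
    DECIMAL = 1
    STRING = 2


@dataclass
class CsvHeader:
    name: str
    type: ColumnType


@dataclass
class CsvValue:
    # Only one of these is set, chosen by the column's declared type.
    string_value: Optional[str] = None
    decimal_value: Optional[Decimal] = None


@dataclass
class CsvRow:
    value: List[CsvValue] = field(default_factory=list)


@dataclass
class CsvFile:
    header: List[CsvHeader] = field(default_factory=list)
    row: List[CsvRow] = field(default_factory=list)


def load(text, column_types):
    """Parse CSV text whose first line holds the column names."""
    reader = csv.reader(io.StringIO(text, newline=""))
    names = next(reader)
    result = CsvFile(header=[CsvHeader(n, t) for n, t in zip(names, column_types)])
    for record in reader:
        row = CsvRow()
        # The column is implicit: the i-th value belongs to the i-th header.
        for cell, head in zip(record, result.header):
            if head.type is ColumnType.DECIMAL:
                row.value.append(CsvValue(decimal_value=Decimal(cell)))
            else:
                row.value.append(CsvValue(string_value=cell))
        result.row.append(row)
    return result


csv_file = load("name,price\nwidget,19.99\n", [ColumnType.STRING, ColumnType.DECIMAL])
```

Extra header or footer data would correspond to further fields on `CsvFile`, which matches the "only add optional fields" approach to versioning.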

Jon Skeet 2008-12-16 14:38:07

Yeah, reading about the PB encoding didn't fill me with hope, as my data is mainly dense numbers. Still, I'll give it a shot and see what happens.
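
For a rough sense of the sizes involved, here is a minimal sketch of the base-128 varint encoding that protocol buffers use for integer fields. It shows how many bytes a non-negative integer occupies, not counting the tag that precedes each field. The function name `encode_varint` is made up for this example, and floating-point or decimal values would be encoded differently.

```python
def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F  # take the lowest 7 bits
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)  # high bit clear: last byte
            return bytes(out)


# Encoded size in bytes for a few sample values.
sizes = {n: len(encode_varint(n)) for n in (1, 127, 128, 300, 1_000_000)}
```

Small integers fit in one byte, while larger ones grow by one byte for every additional 7 bits, so whether this beats the text form depends on the range of the numbers.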

Cameron MacFarland 2008-12-16 14:49:27

If you're interested in System.Decimal representations in PB, that probably deserves a separate question - or a post on the PB discussion group. Marc and I have discussed this before (and might do more tonight - Marc?).

Jon Skeet 2008-12-16 14:55:25

@Jon - quite probably ;-p

Marc Gravell 2008-12-16 16:29:53
