I am trying to cast a data stream into a struct, since the stream consists of fixed-width messages and each message has fully defined fixed-width fields as well. I was planning on creating a struct and then using reinterpret_cast to cast a pointer to the data stream to the struct to get at the fields. I wrote some test code and get weird results. Could anyone explain why I am getting these, or how to correct the code? (The data stream will be mixed binary and alphanumeric, but I'm just testing with strings.)

#include <iostream>
using namespace std;

#pragma pack(push,1)
struct Header 
{
    char msgType[1];
    char filler[1];
    char third[1];
    char fourth[1];
};
#pragma pack(pop)

int main(void)
{
    cout << sizeof(Header) << endl;

    char data[] = "four";   // writable array; a string literal cannot initialise a plain char*
    Header* header = reinterpret_cast<Header*>(data);
    cout << header->msgType << endl;
    cout << header->filler << endl;
    cout << header->third << endl;
    cout << header->fourth << endl;
    return 0;
}

The results that come up are

4
four
our
ur
r

I think "four", "our" and "ur" are printed because cout can't find the null terminator. How do I get around the null terminator issue?

+2  A: 

You're right about the lack of null terminator. The reason it's printing "ur" again is because you repeated the header->third instead of header->fourth. Instead of "char[1]", why not just declare those variables as "char"?

struct Header 
{
    char msgType;
    char filler;
    char third;
    char fourth;
};
Paul Tomblin
I have around 20 different messages with 300 or more fields collectively, so I use a script to generate the structs for me. The script puts [1] there because some fields need to be 2 characters wide, and so on.
TP
+2  A: 

The issue is not reinterpret_cast (although using it is a very bad idea) but in the types of the things in the struct. They should be of type 'char', not of type 'char[1]'.

anon
char works. What if I want the first field to be 2 characters long and the rest only 1? What do I use there then?
TP
Then you are stuffed, if you still want to use reinterpret_cast on a string that doesn't contain null terminators. If it can contain null terminators, make the first one char[3] and the string "fo\0ur" or something similar.
anon
Any suggestions on a better alternative to reinterpret_cast?
TP
Hard to say without knowing more about your problem. But providing your structs with a constructor that explicitly parses the input string would be a start.
anon
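A minimal sketch of the "constructor that explicitly parses the input" idea, for illustration only. The name ParsedHeader and its layout (a 2-byte msgType, a 1-byte filler, a 1-byte third field) are assumptions made up for this example, not the asker's real protocol.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical 4-byte wire layout: msgType[2], filler[1], third[1].
struct ParsedHeader {
    std::string msgType;   // copied out, so no terminator is needed in the stream
    char third;

    static const std::size_t kWireSize = 4;

    // Parse the fields explicitly instead of overlaying a struct on the buffer.
    ParsedHeader(const char* data, std::size_t len) {
        if (len < kWireSize) throw std::length_error("header too short");
        msgType.assign(data, 2);   // bytes 0..1
        // byte 2 is the filler and is skipped
        third = data[3];           // byte 3
    }
};
```

With this, `ParsedHeader h(buffer, length);` gives fields that print normally with cout, because each one is either a std::string or a single char.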
A: 
#include <algorithm>
#include <iostream>
using namespace std;

#pragma pack(push,1)
template<int N>
struct THeader 
{
    char msgType[1+N];
    char filler[1+N];
    char third[1+N];
    char fourth[1+N];
};
typedef THeader<0> Header0;
typedef THeader<1> Header1;  
Header1 Convert(const Header0 & h0) {
   Header1  h1 = {0};
   std::copy(h0.msgType, h0.msgType + sizeof(h0.msgType)/sizeof(h0.msgType[0]), h1.msgType);
   std::copy(h0.filler, h0.filler+ sizeof(h0.filler)/sizeof(h0.filler[0]), h1.filler);
   std::copy(h0.third , h0.third + sizeof(h0.third) /sizeof(h0.third [0]), h1.third);
   std::copy(h0.fourth, h0.fourth+ sizeof(h0.fourth)/sizeof(h0.fourth[0]), h1.fourth);
   return h1;
}
#pragma pack(pop)


int main(void)
{
  cout << sizeof(Header0) << endl;
  char data[] = "four";
  Header0* header0 = reinterpret_cast<Header0*>(data);
  Header1 header = Convert(*header0);
  cout << header.msgType << endl;
  cout << header.filler << endl;
  cout << header.third << endl;
  cout << header.fourth << endl;
  return 0;
}
Alexey Malistov
+3  A: 

In order to be able to print an array of chars, and being able to distinguish it from a null-terminated string, you need other operator<< definitions:

template< size_t N >
std::ostream& operator<<( std::ostream& out, char (&array)[N] ) {
     for( size_t i = 0; i != N; ++i ) out << array[i];
     return out;
}
xtofl
This is interesting. What if some strings are null-terminated and some are not? A field of length 8 may hold only 1 character with the rest padded with nulls, or it may hold all 8 characters with no null terminator. This would print 8 characters in both cases. How can I print 8 if full, and fewer than 8 if null-terminated?
TP
You could add an `if (array[i]==0 ) break;` in the loop body.
xtofl
I changed your loop statement to `for( size_t i = 0; (i != N) && array[i]; ++i ) out << array[i];` and it works pretty nicely. Thanks.
TP
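Putting the answer and the comments together, the operator could look roughly like this. It is only a sketch: Record and fieldToString are hypothetical names added for the example. Note that the overload takes a non-const char array, so it is not selected for fields reached through a const struct.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Print a fixed-width char field: stop at the first NUL, or after N chars
// if the field is completely filled and has no terminator.
template <std::size_t N>
std::ostream& operator<<(std::ostream& out, char (&array)[N]) {
    for (std::size_t i = 0; i != N && array[i] != '\0'; ++i) out << array[i];
    return out;
}

// Hypothetical record with two fixed-width fields.
struct Record {
    char code[2];
    char name[8];
};

// Helper (an assumption for this example) that renders a field via the operator above.
template <std::size_t N>
std::string fieldToString(char (&array)[N]) {
    std::ostringstream os;
    os << array;
    return os.str();
}
```

So `cout << rec.name` prints only the characters before the padding, and all 8 when the field is full.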
A: 

In my experience, using #pragma pack has caused headaches -- partially due to a compiler that doesn't correctly pop, but also due to developers forgetting to pop in one header. One mistake like that and structs end up defined differently depending on which order headers get included in a compilation unit. It's a debugging nightmare.

I try not to do memory overlays for that reason -- you can't trust that your struct is properly aligned with the data you are expecting. Instead, I create structs (or classes) that contain the data from a message in a "native" C++ format. For example, you don't need a "filler" field defined if it's just there for alignment purposes. And perhaps it makes more sense for the type of a field to be int than for it to be char[4]. As soon as possible, translate the datastream into the "native" type.

Dan
The filler is defined in the message protocol that I am working with.
TP
I understand in the byte stream representation, there is a filler byte. But I'd bet in your application code, you never have a need to set that filler byte, or examine its value. It's just padding. If you manually extracted bytes from the stream into your "native" structures, there would be no need to copy the filler byte, you could just skip over it in the stream. That's what I was trying to say.
Dan
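To make the "native format" idea concrete, here is a rough sketch. The 8-byte layout (type, filler, a 4-digit ASCII count, a 2-character flag) and the names NativeMessage, parseDigits and decode are all invented for illustration; a real protocol will differ.

```cpp
#include <cstddef>
#include <string>

// Hypothetical wire layout (8 bytes): type[1] filler[1] count[4 ASCII digits] flag[2].
// The native struct has no filler member and stores the count as an int.
struct NativeMessage {
    char type;
    int count;
    std::string flag;
};

// Convert a fixed-width field of ASCII digits to an int; false on any non-digit.
inline bool parseDigits(const char* p, std::size_t width, int& out) {
    int value = 0;
    for (std::size_t i = 0; i < width; ++i) {
        if (p[i] < '0' || p[i] > '9') return false;
        value = value * 10 + (p[i] - '0');
    }
    out = value;
    return true;
}

// Walk the byte stream field by field, skipping the filler byte.
inline bool decode(const char* buf, std::size_t len, NativeMessage& msg) {
    if (len < 8) return false;
    msg.type = buf[0];
    // buf[1] is filler: nothing to copy
    if (!parseDigits(buf + 2, 4, msg.count)) return false;
    msg.flag.assign(buf + 6, 2);
    return true;
}
```

Because nothing is overlaid on the buffer, packing pragmas and alignment do not come into it.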
A: 

Assuming you want to keep using an overlayable struct (which is sensible, since it avoids the copy in Alexey's code), you can replace your raw char arrays with a wrapper like the following:

#include <cstring>   // memchr
#include <ostream>

template <int N> struct FixedStr {
    char v[N];
};

template <int N>
std::ostream& operator<<( std::ostream& out, FixedStr<N> const &str) {
    // Stop at the first NUL if the field is NUL-padded; otherwise write all N bytes.
    char const *nul = (char const *)memchr(str.v, 0, N);
    int n = (nul == NULL) ? N : static_cast<int>(nul - str.v);
    out.write(str.v, n);
    return out;
}

Then your generated structures will look like:

struct Header 
{
    FixedStr<1> msgType;
    FixedStr<1> filler;
    FixedStr<1> third;
    FixedStr<40> forty;
};

and your existing code should work fine.

NB. you can add methods to FixedStr if you want (eg, std::string FixedStr::toString()) just don't add virtual methods or inheritance, and it will overlay fine.

Useless
This could work, but I will need structs with different-sized elements ranging from 1 to 40, so a template would not work there.
TP
FixedStr<1> to FixedStr<40>, you mean? You just move the field width from the square brackets in your code to the angle brackets in mine.
Useless
... also added support for nul-padded fields.
Useless
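For reference, a self-contained variant of the wrapper with the toString() method mentioned above. This is a sketch under some assumptions: it uses static_assert (C++11), it assumes the compiler adds no padding to a struct made only of char arrays, and loadHeader is a hypothetical helper that copies the bytes with memcpy rather than casting the buffer pointer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

template <std::size_t N>
struct FixedStr {
    char v[N];

    // Number of characters before the first NUL, or N if the field is full.
    std::size_t length() const {
        const void* nul = std::memchr(v, 0, N);
        return nul ? static_cast<std::size_t>(static_cast<const char*>(nul) - v) : N;
    }

    std::string toString() const { return std::string(v, length()); }
};

// Same layout as the generated struct above: 43 bytes on the wire.
struct Header {
    FixedStr<1> msgType;
    FixedStr<1> filler;
    FixedStr<1> third;
    FixedStr<40> forty;
};

// Fails to compile if the compiler inserts padding (assumed not to happen here).
static_assert(sizeof(Header) == 43, "Header must match the wire size");

// Hypothetical helper: copy the raw bytes into a Header.
inline Header loadHeader(const char* buf) {
    Header h;
    std::memcpy(&h, buf, sizeof h);
    return h;
}
```

Usage would be along the lines of `Header h = loadHeader(buffer); cout << h.forty.toString();`, with the caller making sure the buffer holds at least sizeof(Header) bytes.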