I have some existing code that I've used to write out an image to a bitmap file. One of the lines of code looks like this:

bfh.bfType='MB';

I think I probably copied that from somewhere. One of the other devs says to me "that doesn't look right, isn't it supposed to be 'BM'?" Anyway it does seem to work ok, but on code review it gets refactored to this:

bfh.bfType=*(WORD*)"BM";

A Google search indicates that most of the time the first form ('MB') is used, while some of the time people will do this:

bfh.bfType=0x4D42;

So what is the difference? How can they all give the correct result? What does the multi-byte character constant mean anyway? Are they the same really?

+1  A: 

I did not find the API documentation, but according to http://cboard.cprogramming.com/showthread.php?t=24453, bfType is a field of the bitmap file header. A value of BM most likely stands for "bitmap".

0x4D42 is a hexadecimal value (0x4D for M and 0x42 for B). Written out in little-endian order (least significant byte first), that is the same as "BM" (not "MB"). If it also works with 'MB', that is presumably because the compiler turns the multi-character constant 'MB' into the same value, 0x4D42.

tehvan
A: 

Addendum to tehvan's post:

From Wikipedia's entry on BMP:

File header Note that the first two bytes of the BMP file format (thus the BMP header) are stored in big-endian order. This is the magic number 'BM'. All of the other integer values are stored in little-endian format (i.e. least-significant byte first).

So it looks like the refactored code is correct according to the specification.

Have you tried opening the file with 'MB' as the magic number with a few different photo-editors?

dirkgently
+2  A: 

All three are (probably) equivalent, but for different reasons.

bfh.bfType=0x4D42;

This is the simplest to understand: it just loads bfType with a number that happens to represent ASCII 'M' in bits 8-15 and ASCII 'B' in bits 0-7. If you write this to a stream in little-endian format, then the stream will contain 'B', 'M'.
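
To make the byte order concrete, here is a minimal sketch of writing a 16-bit value least-significant byte first. The helper name put_u16_le is hypothetical (it is not part of any Windows API), and uint16_t stands in for WORD.

```c
#include <stdint.h>

/* Hypothetical helper: store a 16-bit value least-significant byte first,
   which is the order the BMP format uses for its integer fields. */
static void put_u16_le(uint8_t out[2], uint16_t value)
{
    out[0] = (uint8_t)(value & 0xFF);        /* bits 0-7  -> first byte  */
    out[1] = (uint8_t)((value >> 8) & 0xFF); /* bits 8-15 -> second byte */
}
```

Calling put_u16_le(magic, 0x4D42) leaves 'B' (0x42) in magic[0] and 'M' (0x4D) in magic[1], whatever the endian-ness of the machine running it.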

bfh.bfType='MB';

This is essentially equivalent to the first statement: it's just a different way of expressing an integer constant. The value of a multi-character constant is implementation-defined, so exactly what you get depends on the compiler, not on the endian-ness of the machine. Common compilers such as GCC and MSVC put the first character in the higher byte, so 'MB' evaluates to 0x4D42. With such a compiler, writing the value out on a little-endian machine gives 'B', 'M' on the stream.
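
As a rough check of that claim, the sketch below compares 'MB' with 0x4D42. It assumes a compiler that packs multi-character constants the way GCC documents (first character in the higher byte); the C standard leaves the value implementation-defined, so other compilers may give a different result. The function name is hypothetical.

```c
/* GCC and Clang warn about multi-character constants by default;
   this pragma only silences that warning for the example. */
#if defined(__GNUC__)
#pragma GCC diagnostic ignored "-Wmultichar"
#endif

/* Returns 1 if the compiler evaluates 'MB' as ('M' << 8) | 'B'.
   Assumption: this holds for GCC-style packing; it is not guaranteed by the C standard. */
static int multichar_matches_hex(void)
{
    return 'MB' == 0x4D42;
}
```

If this returns 0 on your compiler, bfh.bfType='MB' is not a safe way to spell the magic number there.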

bfh.bfType=*(WORD*)"BM";

Here, the "BM" causes the compiler to create a block of data that looks like 'B', 'M', '\0' and yields a char* pointing to it. This is then cast to WORD* so that, when it's dereferenced, the memory is read as a WORD. Hence it reads the 'B', 'M' into bfType in whatever endian-ness the machine has, and writing it out using the same endian-ness puts 'B', 'M' on your stream. So long as you only use bfType to write out to the stream, this is the most portable of the three with respect to byte order, although the cast itself assumes the string literal is suitably aligned for a WORD. However, if you're doing any comparisons etc. with bfType, it's probably best to pick an endian-ness for it and convert as necessary when reading or writing the value.
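
One way to get the same effect without the pointer cast is to copy the two characters with memcpy. This is only a sketch: uint16_t stands in for WORD, and both function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Build the magic number by copying the characters 'B', 'M' into a
   16-bit value. Unlike *(WORD*)"BM", this needs no aligned string literal. */
static uint16_t bm_magic_native(void)
{
    uint16_t type;
    memcpy(&type, "BM", sizeof type); /* copies 2 bytes; the trailing '\0' is left out */
    return type;
}

/* Return byte `index` (0 or 1) of the value as it is laid out in memory,
   which is the order it would have if the struct were written out raw. */
static unsigned char byte_in_memory(uint16_t value, size_t index)
{
    unsigned char bytes[sizeof value];
    memcpy(bytes, &value, sizeof value);
    return bytes[index];
}
```

On any machine the in-memory bytes of bm_magic_native() are 'B' then 'M', so writing the header out raw puts "BM" at the start of the file. The numeric value differs between little-endian (0x4D42) and big-endian (0x424D) machines, which is why comparisons against a constant need a fixed byte order.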

Dave