Hey there. I'm trying to write a small program that reads the bytes following the last occurrence of "0xFF 0xC0 0x00 0x11", so that they can easily be converted to binary or decimal. The reason is that bytes 2-5 after the last occurrence of that hex pattern hold the width and height of a JPEG file.

#include <stdio.h>
#include <stdlib.h>   /* malloc, free, exit */

 int main (void) {
  FILE * pFile;
  long lSize;
  unsigned char * buffer;   /* unsigned, so 0xFF compares as 255 rather than -1 */
  size_t result;

  pFile = fopen ( "pano8sample.jpg" , "rb" );
  if(pFile==NULL){
   fputs ("File error",stderr);
   exit (1);
  }

  /* find the file size, then go back to the start */
  fseek (pFile , 0 , SEEK_END);
  lSize = ftell (pFile);
  rewind (pFile);

  printf("\n\nFile is %ld bytes big\n\n", lSize);

  buffer = malloc ((size_t)lSize);
  if(buffer == NULL){
   fputs("Memory error",stderr);
   exit (2);
  }

  result = fread (buffer,1,(size_t)lSize,pFile);
  if(result != (size_t)lSize){
   fputs("Reading error",stderr);
   exit (3);
  }

  //0xFF 0xC0 0x00 0x11 (0x08)

  //Logic to check for hex/binary/dec

  fclose (pFile);
  free (buffer);
  return 0;
 }

The problem is I don't know how to step through the buffer in memory and use the most recently read byte as an int to compare against my binary/hex/dec pattern.

How do I do this?

A: 

You can use the fscanf function in C/C++ if the data is encoded in ASCII. If it's not, you will have to write your own function to do this. A simple way would be to read N bytes from the file, search that byte string for the pattern you want, and continue until EOF (taking care that a match may straddle two reads).

Your code actually reads the entire file all at once (unnecessary if the bytes you are looking for are near the top of the file). Your code stores the file on the heap as a byte array (char is equivalent to a byte in C++), with buffer pointing to the start of the contiguous array in memory. Manipulate the buffer array just like you would manipulate any other array.

Also, if you intend to do anything after you have read the size, make sure you free the malloced buffer object to avoid a leak.

ldog
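
As a rough illustration of treating the buffer like any other array, here is a minimal sketch that walks backward through it and returns the last place the pattern occurs. The helper name find_last and its parameters are assumptions made for this example, not part of the original code.

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the last occurrence of pat (patLen bytes) inside
   buf (bufLen bytes), or NULL if it never occurs. find_last is a
   hypothetical helper name, not a library function. */
static const unsigned char *find_last(const unsigned char *buf, size_t bufLen,
                                      const unsigned char *pat, size_t patLen)
{
    size_t i;

    if (patLen == 0 || bufLen < patLen)
        return NULL;

    /* Start at the last position where the pattern could still fit and
       walk backward, so the first hit is the last occurrence. */
    for (i = bufLen - patLen + 1; i-- > 0; ) {
        if (memcmp(buf + i, pat, patLen) == 0)
            return buf + i;
    }
    return NULL;
}
```

With the question's variables this might be called as find_last((const unsigned char *)buffer, (size_t)lSize, pattern, 4), where pattern holds 0xFF 0xC0 0x00 0x11; the bytes of interest then start 4 bytes past the returned pointer.
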
right.. fgetc on a r+b file will return binary integer values yes?
Supernovah
because it seems to be returning junk values. How can I compare the last fgetc result to a binary 8 bit byte?
Supernovah
I don't use fgetc much but I think it returns a byte from the current position of the internal file position.
ldog
fgetc returns an int that holds the byte as an unsigned char value (0-255), or EOF. Keep the result in an int or an unsigned char and compare that. If you store it in a plain char, which may be signed, a byte like 0xFF becomes negative, and that is probably what is screwing you up.
ldog
or you can also compare to a hexadecimal value. Just prepend 0x to what you want to compare with and the compiler will know you mean a hexadecimal value.
ldog
Okay I'm just about done. How do I add two chars together eg. 0xEE and 0xFF such that the result is 0xEEFF?
Supernovah
you left shift the 0xEE by 8 bits (8 bits in one byte) and then bitwise OR the result with 0xFF. So 0xEEFF == (0xEE << 8) | 0xFF
ldog
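
A small sketch of the shift-and-OR step from the comment above, assuming the two bytes are held as unsigned char (with a signed char, sign extension can corrupt the result). The helper name join_be16 is made up for this example.

```c
#include <stdint.h>

/* Combine two bytes into one 16-bit value, high byte first (big-endian).
   join_be16 is a hypothetical helper name. */
static uint16_t join_be16(unsigned char hi, unsigned char lo)
{
    /* Widen to unsigned before shifting so no sign bit is involved. */
    return (uint16_t)(((unsigned)hi << 8) | lo);
}
```

For example, join_be16(0xEE, 0xFF) evaluates to 0xEEFF.
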
+2  A: 
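/* memmem() is an extension declared in <string.h>; with glibc, define _GNU_SOURCE first */
const unsigned char needle[4] = {0xff, 0xc0, 0x00, 0x11};
const unsigned char *cur = (const unsigned char *)buffer;  /* scan with a copy so buffer can still be freed */
size_t remaining = (size_t)lSize;
const unsigned char *last_needle = NULL;
for (;;) {
  const unsigned char *p = memmem(cur, remaining, needle, sizeof needle);
  if (!p) break;
  last_needle = p;                     /* remember the most recent hit */
  remaining -= (size_t)(p + 4 - cur);  /* bytes left after this hit */
  cur = p + 4;                         /* resume just past it */
}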

If last_needle is not null, you can print out last_needle+4...

Keith Randall
The `memmem()` function is not standardized by POSIX; it is available on Linux and AIX, but not on MacOS X (10.5) or Solaris 10.
Jonathan Leffler
For those without a memmem implementation, I leave it as an exercise for the reader...
Keith Randall
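
For platforms without memmem, one possible stand-in is sketched below. It is a plain brute-force search, not how any particular libc implements it, and the name my_memmem is an assumption chosen to avoid clashing with a system-provided memmem.

```c
#include <stddef.h>
#include <string.h>

/* Portable stand-in for memmem(): returns the first occurrence of needle
   in haystack, or NULL. my_memmem is a hypothetical name. */
static void *my_memmem(const void *haystack, size_t hayLen,
                       const void *needle, size_t needleLen)
{
    const unsigned char *h = haystack;
    const unsigned char *n = needle;
    size_t i;

    if (needleLen == 0)
        return (void *)h;      /* treat an empty needle as matching at the start */
    if (hayLen < needleLen)
        return NULL;

    for (i = 0; i + needleLen <= hayLen; i++) {
        /* Cheap first-byte check before the full comparison. */
        if (h[i] == n[0] && memcmp(h + i, n, needleLen) == 0)
            return (void *)(h + i);
    }
    return NULL;
}
```

It takes the same four arguments as memmem, so it could be dropped into the loop above by changing only the function name.
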
+1  A: 

Personally, I'd use a function that swallows one character at a time. The function will use a finite state machine to do a simple regular expression match, saving details in either static local variables or a parameter block structure. You need two sub-blocks, one for the part-matched state and one for the last complete match, each indicating the relevant positions or values as needed.

In this case, you should be able to design this manually. For more complex requirements, look at Ragel.

Steve314
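
A hand-written version of such a matcher might look roughly like the sketch below. It uses the parameter-block variant; the struct and function names (sof_matcher, sof_init, sof_feed) are invented for this example, and it records only the offset just past the last complete match.

```c
#include <stddef.h>

/* Parameter block for the matcher; all names here are hypothetical. */
struct sof_matcher {
    int  state;           /* part-matched state: marker bytes matched so far (0..3) */
    long pos;             /* number of bytes fed so far */
    long last_match_end;  /* offset just past the last full match, or -1 */
};

static void sof_init(struct sof_matcher *m)
{
    m->state = 0;
    m->pos = 0;
    m->last_match_end = -1;
}

/* Feed one byte. Returns 1 when this byte completes FF C0 00 11. */
static int sof_feed(struct sof_matcher *m, unsigned char c)
{
    static const unsigned char marker[4] = {0xFF, 0xC0, 0x00, 0x11};
    int done = 0;

    m->pos++;
    if (c == marker[m->state]) {
        m->state++;
    } else {
        /* Only 0xFF can start the marker, so a mismatching 0xFF
           restarts the match at state 1 instead of 0. */
        m->state = (c == 0xFF) ? 1 : 0;
    }
    if (m->state == 4) {
        m->last_match_end = m->pos;  /* the next byte fed sits at this offset */
        m->state = 0;
        done = 1;
    }
    return done;
}
```

The caller would feed every byte of the file through sof_feed; afterwards last_match_end is the file offset of the first byte following the final marker, or -1 if the marker never appeared.
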
+1  A: 

Instead of reading the entire file into memory, I would use a bit of a state machine. My C is a bit rusty, but:

const unsigned char searchChars[] = {0xFF,0xC0,0x00,0x11};
unsigned char lastBytes[5];
size_t nRead;
int curChar;                 /* int, so EOF can be told apart from 0xFF */
int curSearch = 0;
while((curChar = getc(pFile)) != EOF) {      /* read one char */

    if(curChar == searchChars[curSearch]) {  /* found a match */
        curSearch++;                         /* search for next char */
        if(curSearch > 3) {                  /* found the whole string! */
            curSearch = 0;                   /* start searching again */
            nRead = fread(lastBytes,1,5,pFile); /* read 5 bytes */
            if(nRead < 5) break;             /* file ended inside the 5 bytes */
        }
    } else { /* didn't find a match */
        /* restart; the mismatched byte may itself begin a new match */
        curSearch = (curChar == searchChars[0]) ? 1 : 0;
    }
 }

At the end, lastBytes holds the five bytes that come right after the last occurrence of searchChars.

Igor
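
To turn those five bytes into numbers, something along these lines could work. It assumes the standard layout of a baseline JPEG frame header, where the marker and length are followed by one precision byte, then a 16-bit big-endian height, then a 16-bit big-endian width. The struct and function names are invented for this sketch, and the bytes should be treated as unsigned char.

```c
/* Hypothetical holder for the decoded fields. */
struct jpeg_dims {
    unsigned precision;  /* bits per sample, typically 8 */
    unsigned height;
    unsigned width;
};

/* Decode the five bytes that follow FF C0 00 11 in a baseline JPEG. */
static struct jpeg_dims decode_sof0(const unsigned char b[5])
{
    struct jpeg_dims d;

    d.precision = b[0];
    d.height = ((unsigned)b[1] << 8) | b[2];  /* big-endian 16-bit */
    d.width  = ((unsigned)b[3] << 8) | b[4];  /* big-endian 16-bit */
    return d;
}
```

For the bytes 08 01 E0 02 80 this gives a precision of 8, a height of 480 and a width of 640.
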