tags:

views:

191

answers:

3

Hi all. I'm stuck with an apparently harmless piece of code. I'm trying to read a whole flv video file into a uint8_t array, but by no reason only the 10 first bytes are read.

contents = malloc(size + 1);

if (read(fd, contents, size) < 0)
{
    free(contents);
    log_message(WARNING, __func__, EMSG_READFILE);
    return (NULL);
}

I've tried with fopen and "rb" also, but seems that Glibc ignores that extra 'b' or something. Any clues?

Thanks in advance.

Edit: Maybe it reads a EOF character?

PS. 'size' is a variable containing the actual file size using stat().

+3  A: 

It seems the original code correctly reads the entire content.

The problem seems to be in making use of that binary data - printing it out will truncate at the first null, making it appear that only 10 bytes are present. You can't use any methods intended for strings or character arrays to output binary data, as they will truncate at the first null byte, making it appear the array is shorter than it really is.

Check out some other questions related to viewing hex data: http://stackoverflow.com/questions/673240/how-do-i-print-an-unsigned-char-as-hex-in-c-using-ostream http://stackoverflow.com/questions/412954/converting-binary-data-to-printable-hex

If you want to append this to a string - in what format? hex? base64? Raw bytes won't work.

Here's the original code I posted. A few minor improvements, plus some better diagnostic code:

int ret, size = 4096; /* Probably needs to be much bigger */
uint8_t *contents;
contents = malloc(size + 1);

if(contents == NULL)
{
    log_message(WARNING, __func__, EMSG_MEMORY);
    return (NULL);
}

ret = read(fd, contents, size);
if(ret < 0)
{
    /* Error reading file */
    free(contents);
    log_message(WARNING, __func__, EMSG_READFILE);
    return (NULL);
}

for(i = 0;i < ret;++i)
{
    printf("%c", contents[i]);
    /* printf("%0.2X", (char) contents[i]); /* Alternatively, print in hex */
}

Now, is ret really 10? Or do you just get 10 bytes when you try to print the output?

jmanning2k
Wow... read() actually returns the whole size, but in gdb i only get this:$2 = (uint8_t *) 0xb65f2008 "FLV\001\005"
Manuel Abeledo
That's probably the buffer up until the first null byte. Try the printf I just added...Could also be a pointer issue if your declaration is different...
jmanning2k
You ARE reading the whole file then. gdb treats char arrays as strings, and will truncate output at the first null byte. Try 'p/x contents'.
jmanning2k
printf() shows the same contents.
Manuel Abeledo
It would - printf() only prints a string upto the first null. Try looping over the returned data printing each character as %u
Martin Beckett
Same thing. strlen() gives me a 5 byte length array, also.
Manuel Abeledo
@Manuel: `strlen` is useless on binary data, which a FLV video clearly is.
ephemient
And how can i concatenate this data with a string?
Manuel Abeledo
A: 

On success, read() returns the number of bytes read (which may be less than what you asked for, at which point you should ask for the rest.) On EOF it will return 0 and on error it will return -1. There are some errors for which you might want to consider re-issuing the read (eg. EINTR which happens when you get a signal during a read.)

asveikau
+2  A: 

The 'read()' function in the C library doesn't necessarily return the whole read in one shot. In fact, if you're reading very much data at all, it usually doesn't give it to you in a single call.

The solution to this is to call read() in a loop, continuing to ask for more data until you've got it all, or until read returns an error, indicated by a negative return value, or end-of-file, indicated by a zero return value.

Something like the following (untested):

contents = malloc(size + 1);

bytesread = 0;
pos = 0;
while (pos < size && (bytesread = read(fd, contents + pos, size - pos)) > 0)
{
    pos += bytesread;
}

if (bytesread < 0)
{
    free(contents);
    log_message(WARNING, __func__, EMSG_READFILE);
    return (NULL);
}

/* Go on to use 'contents' now, since it's been filled.  Should probably 
   check that pos == size to make sure the file was the size you expected. */

Note that most C programmers would do this a little differently, probably making 'pos' a pointer which gets moved along, rather than offsetting from 'contents' each time through the loop. But I thought this approach might be clearer.

swillden
+1 for calling 'read' in a loop. should check if malloc fails due.
Liran Orevi
Yes, the code should check if malloc fails. In practice, malloc virtually (hehe) never fails on Linux or other modern OSes, and if it does the program is just going to segfault as soon as the content pointer is dereferenced. Odds are if malloc does fail, your only option is to abort the program anyway. But, for correctness, the return value of malloc should be checked.
swillden