but in the above case it is returning sufficient data but the length reported is all wrong
No, you are wrong about that. It is returning what it says it's returning, 89 bytes. The problem is that those 89 bytes don't include a nul terminator so, when you printf the buffer as a string, it keeps going past them, printing whatever was already in the rest of the buffer before your read happened.
What you should be doing (but see caveat below) is something like:
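len = read(FD, buf, 1500);
if (len > 0) printf ("%.*s\n", (int) len, buf);
to ensure you don't print beyond the data that was actually read. The precision argument has to be an int, hence the cast, and the check matters because read returns -1 on error and a negative precision is treated as if no precision had been given at all.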
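As a slightly fuller illustration of the same idea, here is a minimal sketch that reads once and prints only the bytes that arrived. The helper name read_and_print and its out parameter are made up for this example; they are not part of your code or of any library.

```c
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: read up to cap bytes from fd into buf, then print
 * only the bytes that were actually read. Returns whatever read() returned. */
ssize_t read_and_print(int fd, char *buf, size_t cap, FILE *out)
{
    ssize_t len = read(fd, buf, cap);
    if (len < 0) {
        perror("read");
        return len;
    }

    /* The precision for %.*s must be an int, so clamp before converting. */
    int precision = (len > INT_MAX) ? INT_MAX : (int)len;

    /* Prints at most 'precision' bytes, so stale data after them is ignored. */
    fprintf(out, "%.*s\n", precision, buf);
    return len;
}
```

A call such as read_and_print(FD, buf, sizeof buf, stdout) would then print the 89 bytes from your read and nothing that was left over from earlier.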
What you're seeing is equivalent to:
char buff[500];
strcpy (buff, "Hello there");
memcpy (buff, "Goodbye", 7);
printf ("%s", buff);
Because you're not transferring the nul character in the memcpy, the buffer you're left with is:
               +---+---+---+---+---+---+---+---+---+---+---+---+
After strcpy : | H | e | l | l | o |   | t | h | e | r | e | \0|
               +---+---+---+---+---+---+---+---+---+---+---+---+
After memcpy : | G | o | o | d | b | y | e | h | e | r | e | \0|
               +---+---+---+---+---+---+---+---+---+---+---+---+
giving the string "Goodbyehere".
Caveat:
If there are nul characters within your data stream, that printf won't work since it'll stop at the first nul character it finds. The read function reads binary data from a file descriptor and it doesn't have to stop at the first newline or nul character.
That would be equivalent to:
char buff[500];
strcpy (buff, "Hello there");
memcpy (buff, "Go\0dbye", 8);
printf ("%s", buff);
               +---+---+---+---+---+---+---+---+---+---+---+---+
After strcpy : | H | e | l | l | o |   | t | h | e | r | e | \0|
               +---+---+---+---+---+---+---+---+---+---+---+---+
After memcpy : | G | o | \0| d | b | y | e | \0| e | r | e | \0|
               +---+---+---+---+---+---+---+---+---+---+---+---+
giving the string "Go".
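If all you want is to get every byte out, embedded nuls included, one option is to skip the string formatting and write the bytes by count. The following is only a sketch; write_bytes is a name invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: write exactly len bytes of buf to out, including any
 * embedded nul bytes. Returns 0 on success, -1 on a short write. */
int write_bytes(FILE *out, const char *buf, size_t len)
{
    /* fwrite works on a byte count, so a nul is just another byte to it. */
    return fwrite(buf, 1, len, out) == len ? 0 : -1;
}
```

Called as write_bytes(stdout, buf, len) after a successful read, this emits all len bytes, whereas the precision-limited printf would still stop at the first nul.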
If you really want to process nul- or newline-terminated strings on what is a binary channel, the following (pseudo-code) is one way to do it:
while true:
    while buffer has no terminator character:
        read some more data into buffer, break on error or end-of-file.
    break on error or end-of-file.
    while buffer has at least one terminator character:
        process data up to first terminator character.
        remove that section from buffer.
It's a process that reads data until you have at least one "unit of work", then processes those units of work until you don't have a complete unit of work left.
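For what it's worth, here is one way that loop might look in C. Treat it as a sketch under a few assumptions rather than a finished implementation: the names process_records, record_fn, print_record and BUF_CAP are all invented for this example, records are assumed to fit in a fixed-size buffer, a partial record left over at end-of-file is simply dropped, and interrupted reads (EINTR) are not retried. It drains complete records before each read instead of after, which comes to the same thing.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define BUF_CAP 4096    /* assumed upper bound on one record plus terminator */

/* Callback invoked once per complete record (terminator not included). */
typedef void (*record_fn)(const char *rec, size_t len, void *ctx);

/* Example callback: ctx is a FILE *, each record is written in brackets. */
void print_record(const char *rec, size_t len, void *ctx)
{
    FILE *out = ctx;
    fputc('[', out);
    fwrite(rec, 1, len, out);   /* by count, so embedded nuls survive */
    fputs("]\n", out);
}

/* Returns 0 at end-of-file, -1 on a read error or an over-long record. */
int process_records(int fd, char term, record_fn handle, void *ctx)
{
    char buf[BUF_CAP];
    size_t used = 0;            /* number of valid bytes currently in buf */

    for (;;) {
        char *end;

        /* Process every complete unit of work currently in the buffer. */
        while ((end = memchr(buf, term, used)) != NULL) {
            size_t rec_len = (size_t)(end - buf);
            handle(buf, rec_len, ctx);

            /* Remove that record and its terminator from the buffer. */
            used -= rec_len + 1;
            memmove(buf, end + 1, used);
        }

        if (used == sizeof buf)
            return -1;          /* buffer full and still no terminator */

        /* No complete record left, so read some more data. */
        ssize_t n = read(fd, buf + used, sizeof buf - used);
        if (n < 0)
            return -1;          /* read error */
        if (n == 0)
            return 0;           /* end-of-file */
        used += (size_t)n;
    }
}
```

Something like process_records(FD, '\n', print_record, stdout) would then handle newline-terminated records, and passing '\0' as the terminator handles nul-terminated ones.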