It is well known that, in general, the read(2) system call can return fewer bytes than were requested. However, quite a few programs assume that when working with local files, read(2) never returns fewer bytes than requested (unless the file is shorter, of course).

So, my question is: on Linux, in which cases can read(2) return less than what was requested if reading from an open file and EOF is not encountered and the amount being read is a few kilobytes at maximum?

Some guesses:

  • Can received signals interrupt a read like that, but not make it fail?
  • Can different filesystems affect this behavior? Is there anything special about jffs2?
+1  A: 

A received signal only makes read() fail if it hasn't yet read a single byte. Otherwise, it will return partial data.
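
To make the distinction concrete, here is a minimal sketch that classifies the result of a single read() call. The names read_once and enum read_outcome are made up for this illustration; only read(2) and errno come from the system.

```c
#define _POSIX_C_SOURCE 200809L  /* for ssize_t and read() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical classification of what one read() call did. */
enum read_outcome {
    READ_FULL,        /* got every byte that was requested */
    READ_SHORT,       /* got some data, but fewer bytes than requested */
    READ_EOF,         /* read() returned 0: end of file */
    READ_INTERRUPTED, /* read() returned -1 with EINTR: no data was transferred */
    READ_FAILED       /* read() returned -1 with any other errno */
};

/* Perform exactly one read() and report what happened.
 * On READ_FULL or READ_SHORT, *got holds the byte count; otherwise it is 0. */
enum read_outcome read_once(int fd, void *buf, size_t want, size_t *got)
{
    assert(buf != NULL && got != NULL && want > 0);
    *got = 0;

    ssize_t n = read(fd, buf, want);
    if (n < 0)
        return errno == EINTR ? READ_INTERRUPTED : READ_FAILED;
    if (n == 0)
        return READ_EOF;

    *got = (size_t)n;
    return (size_t)n < want ? READ_SHORT : READ_FULL;
}
```

A signal that arrives after some bytes were already transferred shows up as READ_SHORT here, not as READ_INTERRUPTED, which is why callers cannot rely on EINTR alone to detect it.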

And I guess alternate filesystems may indeed return short reads in other situations. For example, it makes some sense (to me) for a network-based filesystem to behave just like a network socket with respect to short reads (that is, to return them often).

Nicolás
Thanks, this was helpful! Though the information on interruptible and uninterruptible filesystems was even more helpful.
Nakedible
+2  A: 

I have to ask: "why do you care about the reason"? If read can return a number of bytes less than the requested amount (which, as you point out, it certainly can) why would you not want to deal with that situation?

anon
To add, you are going to check the data anyways - so if it is short, you'll know immediately. Otherwise, what else is the reason for reading?
0A0D
Neil, I have to ask: why do you care why he wants to know how this can happen? Even if he deals with this situation it is still very helpful to know how it can happen, e.g. so that he can try it and test that his code handles it as expected. And if it isn't his own personal code that is not handling this case, this information would be needed as part of the instructions to reproduce the problem that should accompany any bug report or patch submission.
mark4o
The reason I'm asking is that we are seeing this behavior on an installed base of thousands of systems and we need to assess as accurately as possible how common this problem is likely to be in the long run. Understanding how or why it happens is a part of the investigation.
Nakedible
+1  A: 

If it's really a file you are reading, then you can get a short read as the last read before end of file.

However, it's generally best to behave as if ANY read could be a short read. If what you are reading is a pipe or an input device (stdin) rather than a file, you can get a short read whenever your buffer is larger than what is currently in the input buffer.
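
The usual way to treat every read as potentially short is a retry loop. The following is only a sketch; read_full is a name chosen for this example and is not a standard function.

```c
#define _POSIX_C_SOURCE 200809L  /* for ssize_t and read() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep calling read() until `count` bytes have arrived, EOF is reached,
 * or a real error occurs.
 * Returns the number of bytes read (less than `count` only at EOF),
 * or -1 on error with errno set by read(). */
ssize_t read_full(int fd, void *buf, size_t count)
{
    assert(buf != NULL);
    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted before any data arrived: retry */
            return -1;         /* genuine error */
        }
        if (n == 0)
            break;             /* EOF: nothing more will come */
        done += (size_t)n;     /* short read: loop again for the rest */
    }
    return (ssize_t)done;
}
```

With this helper, a result smaller than the requested count can only mean end of file, so the caller no longer has to care why an individual read() came back short. Note that on error the sketch discards the count of bytes already read; code that needs that count would have to return it separately.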

John Knoeller
What I meant by not encountering EOF is exactly that it is not the last read before the end of the file. Also, the file in question is a regular file.
Nakedible
+5  A: 

POSIX.1-2008 states:

The value returned may be less than nbyte if the number of bytes left in the file is less than nbyte, if the read() request was interrupted by a signal, or if the file is a pipe or FIFO or special file and has fewer than nbyte bytes immediately available for reading.

Disk-based filesystems generally use uninterruptible reads, which means that the read operation generally cannot be interrupted by a signal. Network-based filesystems sometimes use interruptible reads, which can return partial data or no data. (In the case of NFS this was historically configurable using the intr mount option; Linux kernels after 2.6.25 ignore that option and allow only SIGKILL to interrupt a pending NFS operation.) They sometimes also implement timeouts.

Keep in mind that even /some/arbitrary/file/path may refer to a FIFO or special file, so what you thought was a regular file may not be. It is therefore good practice to handle partial reads even though they may be unlikely.
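
If you want to verify that assumption at run time, fstat(2) on the open descriptor tells you what you actually have. A small sketch follows; is_regular_fd and require_regular are hypothetical helper names used only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* for fstat(), S_ISREG and fileno() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if fd refers to a regular file, 0 if it refers to something
 * else (FIFO, device, socket, directory, ...), and -1 if fstat() fails. */
int is_regular_fd(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0)
        return -1;
    return S_ISREG(st.st_mode) ? 1 : 0;
}

/* Returns 0 if fd is a regular file; otherwise prints a diagnostic
 * mentioning `name` to stderr and returns -1. */
int require_regular(int fd, const char *name)
{
    assert(name != NULL);
    int r = is_regular_fd(fd);
    if (r != 1) {
        fprintf(stderr, "%s: %s\n", name,
                r == 0 ? "not a regular file" : "fstat failed");
        return -1;
    }
    return 0;
}
```

Checking the descriptor rather than the path avoids a race in which the path is replaced between the check and the open(). Even when the check passes, handling partial reads is still the safer choice.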

mark4o
Thank you. If this is correct, then we have some more debugging to do. We are getting a confirmed short read on a jffs2 filesystem (which should not have interruptible reads, I guess), and the file is definitely a regular file. The situation happens at most once a year, so reproducibility is low.
Nakedible
A: 

I am not sure, but this situation could arise when the OS is running out of pages in the page cache. One might expect the flush thread to be invoked in that case, but that depends on the heuristics used by the I/O scheduler. If it is not, the read might return fewer bytes than requested.

Algorist