ansaurus

Question

Answer 1

+5 A:

Check what system calls dd makes, not just the open but also the subsequent reads and writes. Using the right buffer sizes can make a significant difference in this kind of large copy. Note that /dev/zero is not a good test for benchmarking if your final goal is a disk-to-disk copy.

If you can't match dd's speed by matching it system call for system call... well, read the source.
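For comparison against an `strace` of dd, here is a minimal sketch of the read/write loop a copy program typically makes. It is not dd's actual code: the names `write_all` and `copy_blocks` are hypothetical, and the sketch assumes plain blocking file descriptors. It issues one `read` per block and retries short writes, so each block should appear as one `read` followed by one or more `write` calls in a trace.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes, retrying on short
 * writes and on EINTR. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: copy in_fd to out_fd, asking for block_size
 * bytes per read(). Returns total bytes copied, or -1 on error. */
long long copy_blocks(int in_fd, int out_fd, size_t block_size)
{
    assert(block_size > 0);
    char *buf = malloc(block_size);
    if (buf == NULL)
        return -1;

    long long total = 0;
    for (;;) {
        ssize_t n = read(in_fd, buf, block_size);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            total = -1;
            break;
        }
        if (n == 0)          /* end of input */
            break;
        if (write_all(out_fd, buf, (size_t)n) < 0) {
            total = -1;
            break;
        }
        total += n;
    }
    free(buf);
    return total;
}
```

Calling `copy_blocks(in_fd, out_fd, 32 * 1024)` on two open descriptors corresponds to a 32K block size; if the trace of such a loop differs from dd's, the difference is the place to look.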

Gilles 2010-08-13 19:00:02

I am actually interested in writing directly from memory to /dev/sdb, so I feel /dev/zero should work pretty well. Also, what are you talking about with regard to the reads and writes? I specify the block size in the command to be 32K.

dschatz 2010-08-13 19:05:05

+1 read dd's source. That's why it's there.

Nathon 2010-08-13 19:14:27

The source is 2000 lines. Not exactly a magnum opus. Just check it out: http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/dd.c. It seems it uses read, write, memcpy, memset. Nothing magic there. It does seem to have a few strategies for reading/writing, though some of those variations seem to be designed for special filesystem/OS requirements.

Merlyn Morgan-Graham 2010-08-13 19:31:03

I have looked at it and I don't see any differences:

    fd_reopen (STDOUT_FILENO, output_file, O_WRONLY | opts, perms)
    nread = iread_fnc (STDIN_FILENO, ibuf, input_blocksize);
    size_t nwritten = iwrite (STDOUT_FILENO, obuf, n_bytes_read);

If anyone can tell me what it does that I'm not seeing then I would be grateful.

dschatz 2010-08-13 19:36:12

You don't even need `dd`'s source to see the syscalls. That's what `strace` is for.

qrdl 2010-08-13 19:42:30

@dschatz: again, did you check that your code makes the exact same sequence of `read` and `write` calls (diff their `strace`s)? If they do, and if you're sure you've eliminated any caching effect in your benchmark, then you'll have to study the source harder. Try copying part of the code of `dd` and seeing if that helps. It might not be easy to find the clincher(s)!

Gilles 2010-08-13 19:56:35

Answer 2

A:

I'm leaving the part about matching the system calls to somebody else. This answer is about the buffering part.

Try benchmarking the buffer size you use. Experiment with a range of values.

When learning Java, I wrote a simple clone of 'copy' and then tried to match its speed. Since the code did nothing but plain reads and writes, the buffer size was what really made the difference. I wasn't buffering it myself but I was asking the read to fetch chunks of a given size. The bigger the chunk, the faster it went - up to a point.
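A rough way to run that experiment in C is sketched below. The function names `write_in_chunks` and `time_chunked_write` are hypothetical, and the sketch makes assumptions worth stating: it writes zero-filled buffers, does not retry on `EINTR`, and neither calls `fsync` nor opens with `O_DIRECT`, so on an ordinary file it mostly measures system call overhead plus copies into the page cache.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: write `total` zero bytes to fd using write()
 * calls of at most `chunk` bytes each. Returns the number of write()
 * calls issued, or -1 on error. */
long write_in_chunks(int fd, size_t total, size_t chunk)
{
    assert(chunk > 0);
    char *buf = calloc(1, chunk);
    if (buf == NULL)
        return -1;

    long calls = 0;
    size_t left = total;
    while (left > 0) {
        size_t want = left < chunk ? left : chunk;
        ssize_t n = write(fd, buf, want);
        if (n <= 0) {            /* treat errors and zero-length writes as failure */
            free(buf);
            return -1;
        }
        left -= (size_t)n;       /* a short write just means more calls */
        calls++;
    }
    free(buf);
    return calls;
}

/* Hypothetical helper: time one run of write_in_chunks.
 * Returns elapsed seconds, or a negative value on error. */
double time_chunked_write(int fd, size_t total, size_t chunk)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    long calls = write_in_chunks(fd, total, chunk);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    if (calls < 0)
        return -1.0;
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling `time_chunked_write` in a loop over chunk sizes such as 4K, 32K, 256K and 1M, with the same total each time, gives one timing per size to compare.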

As for using a 32K block size, remember that the OS still uses separate I/O buffers for user-mode processes. Even if you are doing something with specific hardware, i.e. you're writing a driver for a device that has some physical limitation, e.g. a CD-RW drive with sector sizes, the block size is only part of the story. The OS will still have its buffer too.

Kelly French 2010-08-13 19:21:16

The buffer is 32K, which is the same as the block size I use in dd. These translate into the same system calls, so what else is there to experiment with? Also, I open both with the direct flag so the OS will not buffer it.

dschatz 2010-08-13 19:24:40


Can't reach speeds of dd
