ansaurus

Question

Answer 1

+1 A:

One option is to use system() to execute cp, which simply reuses the cp(1) command to do the work. If you only need to make another link to the file, that can be done with link() or symlink().

ConcernedOfTunbridgeWells 2010-02-01 21:10:01

Beware that system() is a security hole.

plinth 2010-02-01 21:12:35

You said "if", but link won't work across file systems, FYI.

Roboprog 2010-02-01 21:14:24

Really? Would you use this in production code? I can't think of a good reason not to but it doesn't strike me as a _clean_ solution.

Motti 2010-02-01 21:14:37

If you specify the path to /bin/cp you're relatively safe, unless the attacker has managed to compromise the system to the extent that they can make modifications to arbitrary system shell utilities in /bin. If they've compromised the system to that extent you've got far bigger problems.

ConcernedOfTunbridgeWells 2010-02-01 21:14:50

Using system to run commands is fairly common in unix-land. With proper hygiene it can be reasonably secure and robust. After all, the commands are designed to be used in this way.

ConcernedOfTunbridgeWells 2010-02-01 21:16:58

@Roboprog - true; if you need to cross filesystems you would need symlink().

ConcernedOfTunbridgeWells 2010-02-01 21:17:52

What will happen if the user creates a file name like "somefile;rm /bin/*"? system() executes the command with sh -c so the text of the entire string is executed by the shell, which means you'd get anything after a semicolon executed as a command - stinks if your code is running setuid too. This is not unlike Bobby Tables (http://xkcd.com/327/). For the trouble it would take to fully sanitize system() you could instead do the fork/exec pair directly on /bin/cp with the correct arguments.

plinth 2010-02-01 21:27:30

plinth: I agree that using `system()` in this way is generally a bad idea, but note that `/` is one of the only two characters *not* allowed in a UNIX filename.

caf 2010-02-01 21:49:21

Alas, in sanitizing for a system call, 'taint perl :-(

Roboprog 2010-02-06 08:51:41

Answer 2

+1 A:

This is tagged as C, but if you're actually in C++ and mis-tagged the question, see: http://stackoverflow.com/questions/829468/how-to-perform-boostfilesystem-copyfile-with-overwrite

In C, maybe there is something in GLib.

Chris H 2010-02-01 21:11:30

Answer 3

+1 A:

sprintf( cmd, "/bin/cp -p \'%s\' \'%s\'", old, new);

system( cmd);

Add some error checks...

Otherwise, open both and loop on read/write, but probably not what you want.

Roboprog 2010-02-01 21:11:32

Dang, I've got to learn to "submit" faster :-)

Roboprog 2010-02-01 21:15:16

This does not work for files that have spaces (or quotes, backslashes, dollar signs, etc.) in the name. I use spaces in file names fairly often.

Dietrich Epp 2010-02-01 21:44:27

Ouch. That's right. Add backslash-single-quotes around the file names in the sprintf().

Roboprog 2010-02-01 21:45:44

OK, this is Swiss cheese (see the valid security concerns in comments elsewhere), but if you have a relatively controlled environment, it might have some use.

Roboprog 2010-02-01 21:47:28

You have a shell code injection vulnerability if you do not properly handle single quote characters in the values of `old` or `new`. A little more effort to use fork and do your own exec can avoid all these problems with quoting.

Chris Johnsen 2010-02-02 01:59:21
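
For anyone who still wants to build a command string, this is a rough sketch of the quoting that the single-quote problem calls for: wrap the name in single quotes and rewrite every embedded single quote as `'\''`. The function name `shell_quote` is an assumption for illustration. It only addresses shell quoting; running cp through fork/exec avoids the shell entirely.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of `s` wrapped in single
 * quotes for a POSIX shell, with each embedded ' rewritten as '\''.
 * Returns NULL if allocation fails; the caller frees the result. */
char *shell_quote(const char *s)
{
    size_t len = strlen(s), quotes = 0;
    size_t i;
    char *out, *p;

    for (i = 0; i < len; i++)
        if (s[i] == '\'')
            quotes++;

    /* Two enclosing quotes, three extra bytes per embedded quote, one NUL. */
    out = malloc(len + 3 * quotes + 3);
    if (out == NULL)
        return NULL;

    p = out;
    *p++ = '\'';
    for (i = 0; i < len; i++) {
        if (s[i] == '\'') {
            memcpy(p, "'\\''", 4); /* close quote, escaped quote, reopen */
            p += 4;
        } else {
            *p++ = s[i];
        }
    }
    *p++ = '\'';
    *p = '\0';
    return out;
}
```

The quoted strings can then be passed to the `%s` placeholders without adding further quotes in the format string, and the command buffer must be sized for the longer, quoted names.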

Yep, simple obvious and wrong, in many cases. Which is why I up-voted some of the more elaborate examples.

Roboprog 2010-02-02 18:19:55

Answer 4

+4 A:

There is no baked-in equivalent of the CopyFile function in the APIs. However, sendfile can be used to copy a file in kernel mode, which is a faster and better solution (for numerous reasons) than opening a file, looping over it to read into a buffer, and writing the output to another file.

Here's some code I grabbed from a project I'm working on:

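#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 on success, 0 on failure. */
static inline int BLCopyFile(const char* source, const char* destination)
{
    //Here we use kernel-space copying for performance reasons.
    //This uses the OS X sendfile() prototype, which differs on other platforms.

    int input, output;

    if( (input = open(source, O_RDONLY)) == -1)
        return 0;

    //O_CREAT requires a mode argument
    if( (output = open(destination, O_WRONLY | O_CREAT, 0644)) == -1)
    {
        close(input);
        return 0;
    }

    //On input, a length of 0 means "send until end of file"
    off_t bytesCopied = 0;

    int result = sendfile(output, input, 0, &bytesCopied, 0, 0) != -1;

    close(input);
    close(output);

    return result;
}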

Computer Guru 2010-02-01 21:17:27

According to the man page, the output argument of `sendfile` must be a socket. Are you sure this works?

Jay Conrod 2010-02-01 21:27:13

The prototype from my man page (OS X): `int sendfile(int fd, int s, off_t offset, off_t *len, struct sf_hdtr *hdtr, int flags);` The output param is fd - file descriptor. At any rate, I tested it quickly (hence the updated non-C++ version) and it worked :)

Computer Guru 2010-02-01 21:38:57

For Linux, Jay Conrod is right - the `out_fd` of `sendfile` could be a regular file in 2.4 kernels, but it now must support the `sendpage` internal kernel API (which essentially means pipe or socket). `sendfile` is implemented differently on different UNIXes - there's no standard semantics for it.

caf 2010-02-01 21:45:01

@Computer Guru: The prototype under Linux is different from the one on OS X, which is why I thought the same thing when I saw the extra parameters to sendfile in your implementation... it is platform dependent - something worth bearing in mind!

tommieb75 2010-02-01 21:59:31
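
For comparison with the OS X prototype quoted above, here is a hedged sketch of the same idea against the Linux prototype, `sendfile(out_fd, in_fd, offset, count)`. It assumes Linux 2.6.33 or later, where `out_fd` may again be a regular file; on the kernels discussed in these comments it would fail for a file destination. The function name `copy_with_sendfile` is invented for this example.

```c
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Linux-only sketch (assumes kernel 2.6.33 or later).
 * Returns 0 on success, -1 on failure. */
int copy_with_sendfile(const char *from, const char *to)
{
    struct stat st;
    int in, out, ok;
    off_t remaining;

    in = open(from, O_RDONLY);
    if (in < 0)
        return -1;
    if (fstat(in, &st) != 0) {
        close(in);
        return -1;
    }
    out = open(to, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (out < 0) {
        close(in);
        return -1;
    }

    remaining = st.st_size;
    while (remaining > 0) {
        /* A NULL offset means "use and advance in's own file offset". */
        ssize_t n = sendfile(out, in, NULL, (size_t)remaining);
        if (n <= 0)
            break; /* error, or the source ended early */
        remaining -= n;
    }

    ok = (remaining == 0);
    close(in);
    if (close(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Because the destination is opened with O_TRUNC, passing the same path for both arguments would destroy the file, so the caller should rule that out first.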

fyi - you can save a lot of work with `if (PathsMatch(source, destination)) return 1;` (where PathsMatch is the appropriate path comparison routine for the locale); otherwise I imagine that the second open would fail.

plinth 2010-02-02 01:44:14

Answer 5

+3 A:

tommieb75 2010-02-01 21:25:34

I am not 100% sure about the sendfile prototype, I think I got one of the parameters wrong... please bear that in mind... :)

tommieb75 2010-02-01 21:29:25

+1, good one (reusable routine and all)

Roboprog 2010-02-01 21:49:39

You have a race condition - the file you have open as `fdSource` and the file you have `stat()ed` are not necessarily the same.

caf 2010-02-01 22:30:28

@caf: Can you give more details as I am looking at it and how can there be a race condition? I will amend the answer accordingly..thanks for letting me know...

tommieb75 2010-02-01 23:53:03

tommieb75: Simple - in between the `open()` call and the `stat()` call, someone else could have renamed the file and put a different file under that name - so you will copy the data from the first file, but using the length of the second one.

caf 2010-02-02 00:35:00

@caf: Holy moly....why didn't I think of that...well spotted...a lock should do the trick on the source file...well done for spotting that...race condition..well I never...as Clint Eastwood in 'Gran Torino' says 'J.C all friday...'

tommieb75 2010-02-02 00:57:21

A lock doesn't help (they're not mandatory), but `fstat` can be used in this case to fix it.

caf 2010-02-02 01:33:21
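
The `fstat` fix can be sketched roughly as follows. The answer's own code is not reproduced on this page, so this is an independent illustration with a made-up helper name (`open_with_size`): the size is read from the descriptor that was actually opened, not looked up again by name.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-only and report its size using
 * fstat() on the descriptor, so the size belongs to the file that was
 * actually opened. Returns the descriptor, or -1 on error (errno set). */
int open_with_size(const char *path, off_t *size)
{
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        int saved_errno = errno; /* close() may overwrite errno */
        close(fd);
        errno = saved_errno;
        return -1;
    }
    *size = st.st_size;
    return fd;
}
```

A rename after the open() no longer matters, because the descriptor keeps referring to the original file.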

@caf: Damnnit..... I just saw your comment after I edited my answer in the code.... dang.... LOL!!!!

tommieb75 2010-02-02 01:46:49

@Caf: Feel free to edit the code if you wish! :)

tommieb75 2010-02-02 01:52:24

Answer 6

+3 A:

It's straightforward to use fork/execl to run cp to do the work for you. This has advantages over system() in that it is not prone to a Bobby Tables attack and you don't need to sanitize the arguments to the same degree. Further, since system() requires you to cobble together the command string and execl does not, you are less likely to have a buffer overflow issue due to sloppy sprintf() checking.

The advantage of calling cp directly instead of writing the copy yourself is not having to worry about elements of the target path existing in the destination. Doing that in roll-your-own code is error-prone and tedious.

I wrote this example in ANSI C and only stubbed out the barest error handling; other than that it's straightforward code.

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

void copy(char *source, char *dest)
{
    int childExitStatus;
    pid_t pid;

    if (!source || !dest) {
        /* handle as you wish */
        return;
    }

    pid = fork();

    if (pid == 0) { /* child */
        execl("/bin/cp", "/bin/cp", source, dest, (char *)0);
        _exit(127); /* only reached if execl itself failed */
    }
    else if (pid < 0) {
        /* error - couldn't start process - you decide how to handle */
    }
    else {
        /* parent - wait for child - this has all error handling, you
         * could just call wait() as long as you are only expecting to
         * have one child process at a time.
         */
        pid_t ws = waitpid( pid, &childExitStatus, 0); /* 0 = block until the child exits */
        if (ws == -1)
        { /* error - handle as you wish */
            return;
        }

        if( WIFEXITED(childExitStatus)) /* exit code in childExitStatus */
        {
            int status = WEXITSTATUS(childExitStatus); /* zero is normal exit */
            /* handle non-zero as you wish */
            (void)status;
        }
        else if (WIFSIGNALED(childExitStatus)) /* killed */
        {
        }
        else if (WIFSTOPPED(childExitStatus)) /* stopped - only reported with WUNTRACED */
        {
        }
    }
}

plinth 2010-02-01 21:54:19

+1 for another long, detailed, slog. Really makes you appreciate the "vector"/list form of system() in perl. Hmm. Maybe a system-ish function with an argv array would be nice to have?!?

Roboprog 2010-02-01 23:20:37
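
On the wish for a system()-like call that takes an argv array: POSIX offers posix_spawn() and posix_spawnp(), which start a program from an argument vector without going through a shell. A minimal sketch follows; the wrapper name `run_argv` and its return convention are assumptions made for this example.

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run argv[0] with the given NULL-terminated argument
 * vector, without involving a shell. Returns the child's exit status
 * (0-255), or -1 if it could not be started or did not exit normally. */
int run_argv(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp searches PATH; it returns an error number, not -1. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    /* Retry if a signal interrupts the wait. */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A copy would then look like `char *args[] = { "/bin/cp", source, dest, NULL }; run_argv(args);`, with each file name passed as a separate argument so that no quoting is needed.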

Answer 7

+3 A:

There is no need to either call non-portable APIs like sendfile, or shell out to external utilities. The same method that worked back in the 70s still works now:

#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

int cp(const char *to, const char *from)
{
    int fd_to, fd_from;
    char buf[4096];
    ssize_t nread;
    int saved_errno;

    fd_from = open(from, O_RDONLY);
    if (fd_from < 0)
        return -1;

    fd_to = open(to, O_WRONLY | O_CREAT | O_EXCL, 0666);
    if (fd_to < 0)
        goto out_error;

    while (nread = read(fd_from, buf, sizeof buf), nread > 0)
    {
        char *out_ptr = buf;
        ssize_t nwritten;

        do {
            nwritten = write(fd_to, out_ptr, nread);

            if (nwritten >= 0)
            {
                nread -= nwritten;
                out_ptr += nwritten;
            }
            else if (errno != EINTR)
            {
                goto out_error;
            }
        } while (nread > 0);
    }

    if (nread == 0)
    {
        if (close(fd_to) < 0)
        {
            fd_to = -1;
            goto out_error;
        }
        close(fd_from);

        /* Success! */
        return 0;
    }

  out_error:
    saved_errno = errno;

    close(fd_from);
    if (fd_to >= 0)
        close(fd_to);

    errno = saved_errno;
    return -1;
}

caf 2010-02-01 23:16:41

@Caf: OMG....g.o.t.o..... :) Your code is saner than mine anyway... ;) The old loop with read/write is the most portable... +1 from me...

tommieb75 2010-02-02 00:03:19

I find controlled use of `goto` can be useful to consolidate the error handling path in one place.

caf 2010-02-02 00:36:42

How can I copy a file on Unix using C?