views:

103

answers:

4

As part of learning C, I wrote the following code to combine directory name with file name. Eg: combine("/home/user", "filename") will result in /home/user/filename. This function is expected work across platforms (atleast on all popular linux distributions and windows 32 and 64bit).

Here is the code.

const char* combine(const char* path1, const char* path2)
{
    if(path1 == NULL && path2 == NULL) {
        return NULL;
    }

    if(path2 == NULL || strlen(path2) == 0) return path1;
    if(path1 == NULL || strlen(path1) == 0) return path2;

    char* directory_separator = "";
#ifdef WIN32
    directory_separator = "\\";
#else 
    directory_separator = "/";
#endif

    char p1[strlen(path1)];                    // (1)
    strcpy(p1, path1);                         // (2) 
    char *last_char = &p1[strlen(path1) - 1];  // (3)

    char *combined = malloc(strlen(path1) + 1 + strlen(path2));
    int append_directory_separator = 0;
    if(strcmp(last_char, directory_separator) != 0) {
        append_directory_separator = 1;
    }
    strcpy(combined, path1);
    if(append_directory_separator)
        strcat(combined, directory_separator);
    strcat(combined, path2);
    return combined;
}

I have the following questions regarding the above code.

  1. Consider the lines numbered 1,2,3. All those 3 lines are for getting the last element from the string. It looks like I am writing more code for such a small thing. What is the correct method to get the last element from the char* string.
  2. To return the result, I am allocating a new string using malloc. I am not sure this is the right way to do this. Is caller expected to free the result? How can I indicate the caller that he has to free the result? Is there a less error prone method available?
  3. How do you rate the code (Poor, Average, Good)? What are the areas that can be imrpoved?

Any help would be great.

Edit

Fixed all the issues discussed and implemented the changes suggested. Here is the updated code.

void combine(char* destination, const char* path1, const char* path2)
{
    if(path1 == NULL && path2 == NULL) {
        strcpy(destination, "");;
    }
    else if(path2 == NULL || strlen(path2) == 0) {
        strcpy(destination, path1);
    }
    else if(path1 == NULL || strlen(path1) == 0) {
        strcpy(destination, path2);
    } 
    else {
        char directory_separator[] = "/";
#ifdef WIN32
        directory_separator[0] = '\\';
#endif
        const char *last_char = path1;
        while(*last_char != '\0')
            last_char++;        
        int append_directory_separator = 0;
        if(strcmp(last_char, directory_separator) != 0) {
            append_directory_separator = 1;
        }
        strcpy(destination, path1);
        if(append_directory_separator)
            strcat(destination, directory_separator);
        strcat(destination, path2);
    }
}

In the new version, caller has to allocate enough buffer and send to combine method. This avoids the use of malloc and free issue. Here is the usage

int main(int argc, char **argv)
{
    const char *d = "/usr/bin";
    const char* f = "filename.txt";
    char result[strlen(d) + strlen(f) + 2];
    combine(result, d, f);
    printf("%s\n", result);
    return 0;
}

Any suggestions for more improvements?

+1  A: 

A quick glance shows:

  1. you are using C++ comments (//) which is not standard C
  2. you are declaring variables part way down the code - also not C. They should be defined at the start of the function.
  3. your string p1 at #1 has 1 too many bytes written to it at #2 because strlen returns the length of a string and you need 1 more byte for the null terminator.
  4. the malloc does not allocate enough memory - you need length of path1 + length of path2 + length of separator + null terminator.
bohica
Thanks. The comments in the code is just for explanation. I don't have it on real code. But with `//` style comments, I am able to compile my program on a C compiler. 2. I didn't get this point. Why should it be defined at the begining of the function? Again this code compiles without any warning.
Appu
The first two points are correct, in that ANSI C only supports comments using `/* */` pairs and variable declarations at the top of the scope. But many modern compilers violate both of these to ease programming. That is, they accept `//` comments and will automatically reorder variable declarations when compiling.
jdmichal
C++ comments and intermingled declarations are valid C99.
Randy Proctor
+2  A: 
  1. The only time you use last_char is in the comparision to check if the last character is a separator.

Why not replace it with this:

/* Retrieve the last character, and compare it to the directory separator character. */
char directory_separator = '\\';
if (path1[strlen(path1) - 1] == directory_separator)
{
    append_directory_separator = 1;
}

If you want to account for the possibility of multiple character separators, you can use the following. But be sure when allocating the combined string to add strlen(directory_separator) instead of just 1.

/* First part is retrieving the address of the character which is
   strlen(directory_separator) characters back from the end of the path1 string.
   This can then be directly compared with the directory_separator string. */
char* directory_separator = "\\";
if (strcmp(&(path1[strlen(path1) - strlen(directory_separator)]), directory_separator))
{
    append_directory_separator = 1;
}
  1. The less error-prone method would be to have the user give you the destination buffer and its length, much the way strcpy works. This makes it clear that they must manage allocating and freeing the memory.

  2. The process seems decent enough. I think there's just some specifics that can be worked on, mostly with doing things in a clunky way. But you are doing well, in that you can already recognize that happening and ask for help.

jdmichal
Thanks. If caller provides the buffer, how he knows about how much to allocate? Of course, caller knows about both paths but not the directory separator. So I am wondering will that be a good choice? All other suggestions are excellent. Much appreciated.
Appu
@Appu: In Unix, MAX_PATH_LEN is a reasonable buffer size.
Joseph Quinsey
Similarly, Windows has a MAX_PATH, which is 260 characters.
jdmichal
If I make `directory_separator` a `char` instead of `char*`, how do I use it in `strcat`?
Appu
I updated the post with latest code. Please take a look again.
Appu
jdmichal
+2  A: 

And there is a memory leak:

const char *one = combine("foo", "file");
const char *two = combine("bar", "");
//...
free(one);   // needed
free(two);   // disaster!

Edit: Your new code looks better. Some minor stylistic changes:

  • Double semi-colon ;; in line 4.
  • In line 6, replace strlen(path2) == 0 with path2[0] == '\0'' or just !path2[0].
  • Similarly in line 9.
  • Remove loop determining last_char, and use const char last_char = path1[strlen(path1) - 1];
  • Change if(append_directory_separator) to if(last_char != directory_separator[0]). And so you don't need the variable append_directory_separator any more.
  • Have your function also return destination, similar to strcpy(dst, src), which returns dst.

Edit: And your loop for last_char has a bug: it always returns the end of path1, and so you could end up with a double slash // in your answer. (But Unix will treat this as a single slash, unless it is at the start). Anyway, my suggestion fixes this--which I see is quite similar to jdmichal's answer. And I see that you had this correct in your original code (which I admit I only glanced at--it was too complicated for my taste; your new code is much better).

And two more, slightly-more subjective, opinions:

  • I would use stpcpy(), to avoid the inefficiency of strcat(). (Easy to write your own, if need be.)
  • Some people have very strong opinions about strcat() and the like as being unsafe. However, I think your usage here is perfectly fine.
Joseph Quinsey
Good point. I was aware about the leaks. But the disaster one was new to me. Can you please explain how that will make problem?
Appu
@Appu: `free(two)` is the same as `free("bar")`. Which may not be a _disaster_. But it is undefined.
Joseph Quinsey
Since you shortcut and return `path1` or `path2` if the other parameter is empty, it is unknown to the user whether they should free the returned value or not. Good catch.
jdmichal
I updated the post with latest code. Please take a look again.
Appu
Thanks for the answer. I solved the problem and learned lot of stuff. Thanks again.
Appu
A: 
char p1[strlen(path1)];

Can you really declare char arrays like that?

Brian Hooper
C99, _yes_. MSVC, no (I believe).
Joseph Quinsey
Thanks. I didn't know that.
Brian Hooper