tags:

views:

176

answers:

8

Hi all,

I trying to do some very basic string processing in C (e.g. given a filename, chop off the file extension, manipulate filename and then add back on the extension)- I'm rather rusty on C and am getting segmentation faults.

char* fname;
char* fname_base;
char* outdir;
char* new_fname;
.....
fname = argv[1];
outdir = argv[2];
fname_len = strlen(fname);
strncpy(fname_base, fname, (fname_len-4)); // weird characters at the end of the truncation?
strcpy(new_fname, outdir); // getting a segmentation on this I think

strcat(new_fname, "/");
strcat(new_fname, fname_base);
strcat(new_fname, "_test");
strcat(new_fname, ".jpg");
printf("string=%s",new_fname);  

Any suggestions or pointers welcome.

Many thanks and apologies for such a basic question

+1  A: 

You have to malloc fname_base and new_fname, I believe.

ie:

fname_base = (char *)(malloc(sizeof(char)*(fname_len+1)));
fname_base[fname_len] = 0; //to stick in the null termination

and similarly for new_fname and outdir

piggles
+3  A: 

You need to allocate memory for new_fname and fname_base. Here's is how you would do it for new_fname:

new_fname = (char*)malloc((strlen(outdir)+1)*sizeof(char));

In strlen(outdir)+1, the +1 part is for allocating memory for the NULL CHARACTER '\0' terminator.

Pablo Santa Cruz
+1, but I think you mean "NUL CHARACTER", rather then "NULL POINTER".
David X
@David X: Thanks! Fixed that.
Pablo Santa Cruz
+1  A: 

You're using uninitialized pointers as targets for strcpy-like functions: fname_base and new_fname: you need to allocate memory areas to work on, or declare them as char array e.g.

char fname_base[FILENAME_MAX];
char new_fname[FILENAME_MAX]; 
philippe
+1  A: 

you could combine the malloc that has been suggested, with the string manipulations in one statement

if ( asprintf(&new_fname,"%s/%s_text.jpg",outdir,fname_base) >= 0 )
     // success, else failed

then at some point, free(new_fname) to release the memory.

(note this is a GNU extension which is also available in *BSD)

mvds
It's trivial to roll your own implementation of `asprintf` based on `vsnprintf`, and then your code is portable.
R..
indeed, especially if you're "rather rusty on C and [..] getting segmentation faults"
mvds
+2  A: 

In addition to what other's are indicating, I would be careful with

strncpy(fname_base, fname, (fname_len-4));

You are assuming you want to chop off the last 4 characters (.???). If there is no file extension or it is not 3 characters, this will not do what you want. The following should give you an idea of what might be needed (I assume that the last '.' indicates the file extension). Note that my 'C' is very rusty (warning!)

char *s;
s = (char *) strrchr (fname, '.');
if (s == 0)
{
    strcpy (fname_base, fname);
}
else
{
    strncpy (fname_base, fname, strlen(fname)-strlen(s));
    fname_base[strlen(fname)-strlen(s)] = 0;
}
Edward Leno
I was worried about the 4 characters etc issue, but wanted to address the segmentation fault first. Many thanks for your above code, it works great!
+1  A: 

Cleaner code:

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

const char *extra = "_test.jpg";

int main(int argc, char** argv)
{
  char *fname = strdup(argv[1]); /* duplicate, we need to truncate the dot */
  char *outdir = argv[1];
  char *dotpos;
  /* ... */
  int new_size = strlen(fname)+strlen(extra);
  char *new_fname = malloc(new_size);   
  dotpos = strchr(fname, '.');
  if(dotpos)
    *dotpos = '\0'; /* truncate at the dot */
  new_fname = malloc(new_size);
  snprintf(new_fname, new_size, "%s%s", fname, extra);
  printf("%s\n", new_fname);
  return 0;
}
João Pinto
Many thanks for this solution - it looks like an elegant approach, so I'm going to use this. Thanks!
You're `malloc()` ing `new_fname` twice with no `free()` in between the calls.
Praetorian
You are right, actually the second malloc() is an error, it's not really required.I didn't care about free() because the app is exiting soon ;), the fname strdup() would also need a free().
João Pinto
You are using argv[1] for both the fname and the outdir. Also if fname has more than one `.` in it you will catch the first one, not the last one.
nategoose
...and if `fname` has no dots in it, you'll truncate a character off the end (since the `malloc()` isn't long enough).
caf
A: 

The basic of any C string manipulation is that you must write into (and read from unless... ...) memory you "own". Declaring something is a pointer (type *x) reserves space for the pointer, not for the pointee that of course can't be known by magic, and so you have to malloc (or similar) or to provide a local buffer with things like char buf[size].

And you should be always aware of buffer overflow.

As suggested, the usage of sprintf (with a correctly allocated destination buffer) or alike could be a good idea. Anyway if you want to keep your current strcat approach, I remember you that to concatenate strings, strcat have always to "walk" thourgh the current string from its beginning, so that, if you don't need (ops!) buffer overflow checks of any kind, appending chars "by hand" is a bit faster: basically when you finished appending a string, you know where the new end is, and in the next strcat, you can start from there.

But strcat doesn't allow to know the address of the last char appended, and using strlen would nullify the effort. So a possible solution could be

size_t l = strlen(new_fname);
new_fname[l++] = '/';
for(i = 0; fname_base[i] != 0; i++, l++) new_fname[l] = fname_base[i];
for(i = 0; testjpgstring[i] != 0; i++, l++) new_fname[l] = testjpgstring[i];
new_fname[l] = 0; // terminate the string...

and you can continue using l... (testjpgstring = "_test.jpg")

However if your program is full of string manipulations, I suggest using a library for strings (for lazyness I often use glib)

ShinTakezou
wow thanks for all that thorough information - its most appreciated!
+1  A: 

In the following code I do not call malloc.

#include <string.h>
#include <stdio.h>
#include <stdlib.h>

/*  Change this to '\\' if you are doing this on MS-windows or something like it. */
#define DIR_SYM '/'
#define EXT_SYM '.'
#define NEW_EXT "jpg"


int main(int argc, char * argv[] ) {
   char * fname;
   char * outdir;

   if (argc < 3) {
       fprintf(stderr, "I want more command line arguments\n");
       return 1;
   }
   fname = argv[1];
   outdir = argv[2];

   char * fname_base_begin = strrchr(fname, DIR_SYM); /* last occurrence of DIR_SYM */
   if (!fname_base_begin) {
       fname_base_begin = fname; // No directory symbol means that there's nothing
                                 // to chop off of the front.
   }

   char * fname_base_end = strrchr(fname_base_begin, EXT_SYM);
   /* NOTE: No need to search for EXT_SYM in part of the fname that we have cut off
    * the front and then have to deal with finding the last EXT_SYM before the last
    * DIR_SYM */
   if (!fname_base_end) {
       fprintf(stderr, "I don't know what you want to do when there is no extension\n");
       return 1;
   }

   *fname_base_end = '\0'; /* Makes this an end of string instead of EXT_SYM */
   /* NOTE:  In this code I actually changed the string passed in with the previous
    * line.  This is often not what you want to do, but in this case it should be ok.
    */

   // This line should get you the results I think you were trying for in your example
   printf("string=%s%c%s_test%c%s\n", outdir, DIR_SYM, fname_base_begin, EXT_SYM, NEW_EXT);

   // This line should just append _test before the extension, but leave the extension
   // as it was before.
   printf("string=%s%c%s_test%c%s\n", outdir, DIR_SYM, fname_base_begin, EXT_SYM, fname_base_end+1);

   return 0;
} 

I was able to get away with not allocating memory to build the string in because I let printf actually worry about building it, and took advantage of knowing that the original fname string would not be needed in the future.

I could have allocated the space for the string by calculating how long it would need to be based on the parts and then used sprintf to form the string for me.

Also, if you don't want to alter the contents of the fname string you could also have used:

printf("string=%s%c%*s_test%c%s\n", outdir, DIR_SYM, (unsigned)fname_base_begin -(unsigned)fname_base_end, fname_base_begin, EXT_SYM, fname_base_end+1);

To make printf only use part of the string.

nategoose