views:

744

answers:

6

I have a tab-delimited text file that I am parsing. Its first column contains strings of the format chrX, where X denotes a set of strings, e.g., "1", "2", ..., "X", "Y".

These are each stored in a char* called chromosome, as the file is parsed.

The text file is sorted on the first column lexicographically, i.e., I will have a number of rows starting with "chr1", and then "chr2", etc.

At each "chrX" entry, I need to open another file that is associated with this entry:

FILE *merbaseIn;

// loop through rows...

if (chromosome == NULL)                                                                                                                                                   
    openSourceFile(&chromosome, fieldArray[i], &merbaseIn, GENPATHIN);                                                                                                      
else {                                                                                                                                                                    
    if (strcmp(chromosome, fieldArray[i]) != 0) { // new chromosome                                                                                                   
        fclose(merbaseIn); // close old chromosome FILE ptr                                                                                                                                                                                                                                    
        free(chromosome); // free old chromosome ptr                                                                                                                          
        openSourceFile(&chromosome, fieldArray[i], &merbaseIn, GENPATHIN); // set up new chromosome FILE ptr                                                                  
    }                                                                                                                                                                       
}  
// parse row

I have the function openSourceFile that is defined as follows:

void openSourceFile (char** chrome, const char* field, FILE** filePtr, const char *path) {
    char filename[100];                                                                                                                                                           
    *chrome = (char *) malloc ((size_t) strlen(field));
    if (*chrome == NULL) {                                                                                                                                                        
        fprintf(stderr, "ERROR: Cannot allocate memory for chromosome name!");                                                                                                      
        exit(EXIT_FAILURE);                                                                                                                                                         
    }                                                                                                                                                                             

    strcpy(*chrome, field);                                                                                                                                                       
    sprintf(filename,"%s%s.fa", path, field);                                                                                                                                     

    *filePtr = fopen(filename, "r");                                                                                                                                              
    if (*filePtr == NULL) {                                                                                                                                                       
        fprintf(stderr, "ERROR: Could not open fasta source file %s\n", filename);                                                                                                  
        exit(EXIT_FAILURE);                                                                                                                                                         
    }                                                                                                                                                                             
}

The problem is that my application quits with a Segmentation Fault going from the first chromosome to the second (from chr1 to chr2) at the following line, where I close the first chromosome file that I opened:

fclose(merbaseIn);

I know I'm not passing fclose a NULL pointer, because up until the Segmentation Fault, I am reading data from this file. I can even wrap this in a conditional and I still get the Fault:

if (merbaseIn != NULL) {
    fclose(merbaseIn);
}

Further, I know openSourceFile works (at least for chr1, when setting up the first file handle of FILE*) because my application parses chr1 rows and reads data from the FILE* source file correctly.

What is it about this fclose call that is causing a Segmentation Fault to occur?

+4  A: 

one error i notice is this line:

 *chrome = (char *) malloc ((size_t) strlen(field));

which should be:

 *chrome = (char *) malloc ((size_t) strlen(field)+1);

This is because a string has a closing 0 at the end which you also have to make room for

Toad
True, but strlen() returns size_t, so drop that cast. And malloc() returns void *, which is convertible to any other (data) pointer type in C, so drop that cast too.
unwind
In this case, you can use strdup() to replace malloc()+strcpy().
I disagree with the (char *) cast which clearly states the intention.
Toad
Thanks for pointing me to this error.
Alex Reynolds
+1  A: 
valgrind --db-attach=yes --leak-check=yes --tool=memcheck --num-callers=16 --leak-resolution=high ./yourprogram args

It's very likely the segfault is caused by memory corruption on the heap, not anything that's affecting locals. Valgrind will immediately show you the first wrong access you make.

wrang-wrang
I was not running `free` on the pointers within the double pointer. I was also not freeing pointers elsewhere. I scrapped this app and wrote it from scratch, running `valgrind` as I went along to make sure I was managing memory properly. Thanks for pointing me to this tool!
Alex Reynolds
Yeah, valgrind taught me: 1) how everyone was right and I should have only used smart (auto/refcount etc.) pointers or garbage collection and 2) with valgrind I don't actually need garbage collection. The only caveat is test coverage. I think it's easier to find memory leaks w/ valgrind than with any tool for GC languages (e.g. JVM or .NET languages) that I'm familiar with.
wrang-wrang
A: 

Best guess is that some other part of your code is corrupting memory through a buffer overrun or similar bug.

Although unlikely to be the cause, you have a potential overrun condition in your filename array when the full filename exceeds 100 chars.

I would suggest using a debugger to watch for changes in the memory location used by the merbaseIn variable.

Paul Dixon
A: 

Generic Pointer Problem

C is a great language but it does require that you not clobber your own memory. Besides the problem previously noted where the malloc is 1 byte short, you may have other pointer problems.

I would suggest using a memory debugger. In the past, Electric Fence was rather popular but these days I hear more about valgrind. There are lots of other choices.

DigitalRoss
A: 

In addition to error found by reinier, I suspect that:

free(chromosome);

should be followed by:

chromosome = NULL;

in order to prevent potential usage of a no longer valid value.

mouviciel
A: 

why this FILE** filePtr if only this FILE* filePtr would be enough ? just a thought...

Arabcoder