tags:

views:

110

answers:

3

I've written a program called Mathtext. This program gives plain text "style" by shifting certain character ranges into Unicode ranges such as 'mathematical letterlike symbols" to produce plain-text italics, bold, serif, etc.

It works as a line-by-line interpreter, like a shell, outputting the translated line after a line is entered. This means that files can be cat/piped in to translate an entire file, as well as the fact that you can 'exit' the 'shell' by pressing ^D, which is detected by stdin hitting EOF.

Everything works. However, when I press ^D and exit, it segfaults. I still can't quite grasp what is causing this.

Compiling with -g -O0 helps a little; I now know that the problem arises from a strlen call in transpose when ^D is pressed. However, transpose should never be called during ^D, as eof is true!

Program received signal SIGSEGV, Segmentation fault.
__strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:31
31    ../sysdeps/x86_64/multiarch/../strlen.S: No such file or directory.
    in ../sysdeps/x86_64/multiarch/../strlen.S
(gdb) where
#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:31
#1  0x0000000000400b0e in transpose (s=0x0, capsDelta=120263, smallDelta=120257, numDelta=0) at mathtext.c:58
#2  0x0000000000400e2b in main (argc=2, argv=0x7fffffffe4b8) at mathtext.c:92
+3  A: 

Your program is dereferencing NULL, since fgets returns NULL on error or EOF, and you're passing this directly to transpose which uses the result naively.

Hasturkun
+1  A: 

feof can't predict the future. It doesn't know it's the end of file until you actually press the ^D key, by which time your program is back waiting for input in fgets. Reading a file wouldn't produce the error because all the input is already present at the beginning. Check for NULL in your transpose function.

Karl Bielefeldt
An easier way to say this is: EOF is not set until you read past the EOF. The last valid read will read UP-TO but not past EOF and thus the read that fails will have exactly 0 bytes to read.
Martin York
+3  A: 

Most uses of feof() are a bug - and this program demonstrates it perfectly in this main loop:

char temp[1048576];
do {
    if (!strcmp(argv[1], "serifb"))
        transpose(fgets(temp, 1048576, stdin), 119808 - 'A', 119834 - 'a', 120782 - '0');
    else if (!strcmp(argv[1], "serifi"))
        transpose(fgets(temp, 1048576, stdin), 119860 - 'A', 119886 - 'a', 0);
    else if (!strcmp(argv[1], "serifbi"))
        transpose(fgets(temp, 1048576, stdin), 119912 - 'A', 119938 - 'a', 0);
    else if (!strcmp(argv[1], "sans"))
        transpose(fgets(temp, 1048576, stdin), 120224 - 'A', 120250 - 'a', 120802 - '0');
    else if (!strcmp(argv[1], "sansb"))
        transpose(fgets(temp, 1048576, stdin), 120276 - 'A', 120302 - 'a', 120812 - '0');
    else if (!strcmp(argv[1], "sansi"))
        transpose(fgets(temp, 1048576, stdin), 120328 - 'A', 120354 - 'a', 0);
    else if (!strcmp(argv[1], "sansbi"))
        transpose(fgets(temp, 1048576, stdin), 120380 - 'A', 120406 - 'a', 0);
    else if (!strcmp(argv[1], "mono"))
        transpose(fgets(temp, 1048576, stdin), 120432 - 'A', 120458 - 'a', 120822 - '0');
    else if (!strcmp(argv[1], "fullwidth"))
        transposeBlock(fgets(temp, 1048576, stdin), '!', '~', 65281 - '!');
    else return help();
} while(!feof(stdin));

At end-of-file, fgets() will return NULL, and then the next invocation of feof() will return true. So the correct approach is to test the return value of your input function - and since you're doing that test anyway, there's no need to call feof() (unless you want to distinguish a file error from end-of-file).

char temp[1048576];
while (fgets(temp, sizeof temp, stdin) != NULL) {
    if (!strcmp(argv[1], "serifb"))
        transpose(temp, 119808 - 'A', 119834 - 'a', 120782 - '0');
    else if (!strcmp(argv[1], "serifi"))
        transpose(temp, 119860 - 'A', 119886 - 'a', 0);
    else if (!strcmp(argv[1], "serifbi"))
        transpose(temp, 119912 - 'A', 119938 - 'a', 0);
    else if (!strcmp(argv[1], "sans"))
        transpose(temp, 120224 - 'A', 120250 - 'a', 120802 - '0');
    else if (!strcmp(argv[1], "sansb"))
        transpose(temp, 120276 - 'A', 120302 - 'a', 120812 - '0');
    else if (!strcmp(argv[1], "sansi"))
        transpose(temp, 120328 - 'A', 120354 - 'a', 0);
    else if (!strcmp(argv[1], "sansbi"))
        transpose(temp, 120380 - 'A', 120406 - 'a', 0);
    else if (!strcmp(argv[1], "mono"))
        transpose(temp, 120432 - 'A', 120458 - 'a', 120822 - '0');
    else if (!strcmp(argv[1], "fullwidth"))
        transposeBlock(temp, '!', '~', 65281 - '!');
    else return help();
}
caf
Thank you very much, your solution worked perfectly!
Delan Azabani