[Solved] Writing parsing code by hand is a trap. With the program below, a line containing 15 spaces is counted as 15 words, and a blank line also counts as a word, so the totals disagree with wc. Back to flex and bison for me.

#include        <stdio.h>                         
#include        <stdlib.h>                        

int     main(int argc, char     *argv[]) {

        FILE    *fp = NULL;
        int             iChars =0, iWords =0, iLines =0;
        int             ch;                             

        /* if there is a command line arg, then try to open it as the file
                otherwise, use stdin */                                   

        fp = stdin;
        if (argc == 2) {
                fp = fopen(argv[1],"r");
                if (fp == NULL) {       
                        fprintf(stderr,"Unable to open file %s. Exiting.\n",argv[1]);
                        exit(1);
                }
        }

        /* read until the end of file, counting chars, words, lines */
        while ((ch = fgetc(fp)) != EOF) {
                if (ch == '\n') {
                        iWords++;
                        iLines++;
                }

                if (ch == '\t' || ch == ' ') {
                        iWords++;
                }

                iChars++;
        }

        /* all done. If the input file was not stdin, close it*/
        if (fp != stdin) {
                fclose(fp);
        }

        printf("chars: %d,\twords: %d,\tlines: %d.\n",iChars,iWords,iLines);
}

TEST DATA foo.sh

#!/home/ojblass/source/bashcrypt/a.out
This is line 1
This is line 2
This is line 3

ojblass@linux-rjxl:~/source/bashcrypt> wc foo.sh

5 13 85 foo.sh

ojblass@linux-rjxl:~/source/bashcrypt> a.out foo.sh

chars: 85, words: 14, lines: 5.

+2  A: 

You are counting \n as a word even for a blank line.

Kevin Beck
Thank you. A clear indication that it is bedtime!
ojblass
There was a blank line at the end of the test file.
ojblass
+3  A: 

Your algorithm is wrong. If the test file contains two blank characters in succession, the word counter is incremented twice, but it should be incremented only once.

One solution is to remember the last character read. If the current character is a separator (blank, newline, ...) and the previous character was alphanumeric, increment the word counter.
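A minimal sketch of that idea in C, not a definitive implementation. Two assumptions go beyond the answer: any non-whitespace character (as reported by isspace) counts as part of a word, not only alphanumerics, and a last word with no trailing whitespace is counted at end of input. The names wc_state, wc_feed, wc_finish and wc_stream are made up for illustration.

```c
#include <ctype.h>
#include <stdio.h>

/* Illustrative names only; none of these come from the question's code. */
struct wc_state {
    long chars, words, lines;
    int  prev;              /* last character read; initialise to ' ' */
};

/* Account for one character read from the input. */
void wc_feed(struct wc_state *s, int ch)
{
    s->chars++;
    if (ch == '\n')
        s->lines++;
    /* A word ends where whitespace follows a non-whitespace character,
       so runs of blanks and blank lines add nothing. */
    if (isspace(ch) && !isspace(s->prev))
        s->words++;
    s->prev = ch;
}

/* Call once at end of input: count a final word that was not
   followed by any whitespace (assumption, not in the answer). */
void wc_finish(struct wc_state *s)
{
    if (!isspace(s->prev))
        s->words++;
}

/* Count a whole stream, e.g. the fp opened in the question's main(). */
struct wc_state wc_stream(FILE *fp)
{
    struct wc_state s = {0, 0, 0, ' '};
    int ch;

    while ((ch = fgetc(fp)) != EOF)
        wc_feed(&s, ch);
    wc_finish(&s);
    return s;
}
```

In the question's program, the read loop could be replaced by a call to wc_stream(fp) and the three fields printed with %ld. If foo.sh is exactly as shown above (four text lines plus one empty line), this should report 13 words, in line with wc.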

Ionel Bratianu
Also true... not my best work.
ojblass
I should stick to flex and bison. I never learn my lesson.
ojblass