ansaurus

Question

K & R Exercise: My Code Works, But Feels Stinky; Advice for Cleanup?

Answer 1

+7 A:

If you are sticking with chapter 1, that looks pretty good to me. Here's what I would recommend from a code-review standpoint:

When checking equality in C, always put the constant first

if (1 == myvar)

That way you will never accidentally do something like this:

if (myvar = 1)

You can't get away with that in C#, but it compiles fine in C and can be a real devil to debug.

Eric Z Beard 2008-10-02 11:56:01

Whoa, lots of downvotes... haven't had much coffee yet, what's wrong with that?

Eric Z Beard 2008-10-02 12:00:46

no reason to downvote perfectly valid advice, i personally do not bother to do sow but any way perfectly valid voting up.

Ilya 2008-10-02 12:03:47

I guess some people don't like the (1 == x) syntax. I'm one of them, but I don't feel that it's worth a downvote.

aib 2008-10-02 12:04:33

That syntax is part of the MISRA standard. I like it but I know that the guy I sit next to hates it!

Klelky 2008-10-02 12:06:46

Really? What's wrong with that syntax? As soon as I adopted it I stopped wasting so much time tracking down the missing = sign.

Eric Z Beard 2008-10-02 12:07:11

It makes code less readable. A better solution is to set the compiler warning level high enough to flag it as a warning.

Ferruccio 2008-10-02 12:09:54

I agree it is less readable, less natural when you say "if 1 equals myvar". I didn't realize there was a compiler flag, I'd be happier with that.

Eric Z Beard 2008-10-02 12:11:39

I am going to downvote it again: 1==var it has nothing to do with the "uglines" that the question refers to. I do hate this convention - it make me do the work for compiler, but I am downvoting it as irrelevant and off-topic.

Arkadiy 2008-10-02 13:26:24

+8/-5 so far. Wow, I didn't think this would be so polarizing.

Eric Z Beard 2008-10-02 13:31:03

+1 relevant, useful, on-topic...unless you don't like writing maintainable C.

sixlettervariables 2008-10-02 17:51:00

Answer 2

+3 A:

There is no reason to have two buffers, you can trim the input line in place

int trim(char line[])
{
    int len = 0;
    for (len = 0; line[len] != 0; ++len)
     ;

    while (len > 0 &&
           line[len-1] == ' ' && line[len-1] == '\t' && line[len-1] == '\n')
     line[--len] = 0;

    return len;
}

By returning the line length, you can eliminate blank lines by testing for non-zero length lines

if (trim(line) != 0)
    printf("%s\n", line);

EDIT: You can make the while loop even simpler, assuming ASCII encoding.

while (len > 0 && line[len-1] <= ' ')
    line[--len] = 0;

Ferruccio 2008-10-02 12:06:31

This is the kind of idea I'm looking for ... But I plugged it in and played with it a bit, and instead of trimming end spaces and blank lines, it's actually inserting extra blank lines. :)

John Rudy 2008-10-02 13:17:28

That's what happens when you type in code first thing in the morning without actually checking it :-)

Ferruccio 2008-10-02 14:18:43

I wrote != instead of == in the original while loop.

Ferruccio 2008-10-02 14:22:46

:) I'm working with this now, on my lunch break ... I'm still having some glitches, but I think I can work those out on my own ... I think this may well be the answer I accept!

John Rudy 2008-10-02 16:42:34

Answer 3

+1 A:

Personally for while constructs:

I prefer the following:

while( (ret[i] = line[i]) )
        i++;

to:

while ((ret[i] = line[i]) != '\0')
        ++i;

They both check against != 0 but the first looks a little cleaner. If the char is anything other thah 0, then the loop body will execute else it will break out of the loop.

Also for 'for' statements, whilst being syntatically valid, I find that the following:

for (  ; i >= 0; --i)

just looks 'odd' to me and indeed is a potential nightmare solution for potential bugs. If I was reviewing this code, it would be like a glowing red warning like. Typically you want to use for loops for iterating a known number of times, otherwise cosider a while loop. (as always there are exceptions to the rule but Ive found that this generally holds true). The above for statement could become:

while (i)
{
        if (ret[i] == ' ' || ret[i] == '\t')
        {
            ret[i--] = '\0';
        }
        else if (ret[i] != '\0' && ret[i] != '\r' && ret[i] != '\n')
        {
            break;
        }
}

TK 2008-10-02 12:19:00

Erm, you mean "while (ret[i] = line[i])"..

aib 2008-10-02 12:27:55

Thanks for the advice on replacing the for loop with a while construct.

John Rudy 2008-10-02 13:07:21

Answer 4

A:

First of all:

int main(void)

You know the parameters to main(). They're nothing. (Or argc&argv, but I don't think that's Chapter 1 material.)

Stylewise, you might want to try K&R-style brackets. They're much easier on the vertical space:

void trim(char line[], char ret[])
{
    int i = 0;

    while ((ret[i] = line[i]) != '\0')
        ++i;

    if (i == 1) { // Special case to remove entirely blank line
        ret[0] = '\0';
        return;
    }

    for (; i>=0; --i) { //continue backwards from the end of the line
        if ((ret[i] == ' ') || (ret[i] == '\t')) //remove trailing whitespace
            ret[i] = '\0';

        else if ((ret[i] != '\0') && (ret[i] != '\r') && (ret[i] != '\n')) //...until we hit a word character
            break;
    }

    for (i=0; i<MAXLINE-1; ++i) { //-1 because we might need to add a character to the line
        if (ret[i] == '\n') //break on newline
            break;

        if (ret[i] == '\0') { //line doesn't have a \n -- add it
            ret[i] = '\n';
            ret[i+1] = '\0';
            break;
        }
    }
}

(Also added comments and fixed one bug.)

A big issue is the usage of the MAXLINE constant -- main() exclusively uses it for the line and out variables; trim(), which is only working on them doesn't need to use the constant. You should pass the size(s) as a parameter just like you did in getline().

aib 2008-10-02 12:26:23

Answer 5

+1 A:

trim() is too big.

What I think you need is a strlen-ish function (go ahead and write it int stringlength(const char *s)).

Then you need a function called int scanback(const char *s, const char *matches, int start) which starts at start, goes down to z as long as the character being scanned at s id contained in matches, return the last index where a match is found.

Then you need a function called int scanfront(const char *s, const char *matches) which starts at 0 and scans forward as long as the character being scanned at s is contained in matches, returning the last index where a match is found.

Then you need a function called int charinstring(char c, const char *s) which returns non-zero if c is contained in s, 0 otherwise.

You should be able to write trim in terms of these.

plinth 2008-10-02 12:39:36

Answer 6

A:

Here's my stab at the exercise without knowing what is in Chapter 1 or K & R. I assume pointers?

#include "stdio.h"

size_t StrLen(const char* s)
{
    // this will crash if you pass NULL
    size_t l = 0;
    const char* p = s;
    while(*p)
    {
        l++;
        ++p;
    }
    return l;
}

const char* Trim(char* s)
{
    size_t l = StrLen(s);
    if(l < 1)
        return 0;

    char* end = s + l -1;
    while(s < end && (*end == ' ' || *end == '\t'))
    {
        *end = 0;
        --end;
    }

    return s;
}

int Getline(char* out, size_t max)
{
    size_t l = 0;
    char c;
    while(c = getchar())
    {
        ++l;

        if(c == EOF) return 0;
        if(c == '\n') break;

        if(l < max-1)
        {
            out[l-1] = c;
            out[l] = 0;
        }
    }

    return l;
}

#define MAXLINE 1024

int main (int argc, char * const argv[]) 
{
    char line[MAXLINE];
    while (Getline(line, MAXLINE) > 0)
    {
        const char* trimmed = Trim(line);
        if(trimmed)
            printf("|%s|\n", trimmed);

        line[0] = 0;
    }

    return 0;
}

orj 2008-10-02 13:21:41

uh, that looks dangerous. What happens if some one calls Trim(" "); You are going to read memory that is outside of the string. And with some bad luck you are going to write on that memory, too.

quinmars 2008-10-02 14:57:32

There may be bugs in that code. I didn't test it super thoroughly. You are right. The while loop condition in Trim() should also be testing that end is greater than s. Assuming strings grow up in memory addresses.

orj 2008-10-17 11:13:13

Answer 7

A:

Personally I'd put code like this:

ret[i] != '\0' && ret[i] != '\r' && ret[i] != '\n'

into a separate function (or even a define macro)

ilitirit 2008-10-02 13:29:30

Answer 8

A:

trim should indeed use 1 buffer only (as @Ferruccio says).
trim needs to be broken up, as @plinth says
trim needs not return any value (if you want to check for empty string, test line[0] == 0)
for extra C flavor, use pointers rather than indexes

-go to end of line (terminating 0; -while not at the start of line and current character is space, replace it with 0. -back off one char

char *findEndOfString(char *string) {
  while (*string) ++string;
  return string; // string is now pointing to the terminating 0
}

void trim(char *line) {
  char *end = findEndOfString(line);
   // note that we start at the first real character, not at terminating 0
  for (end = end-1; end >= line; end--) {
      if (isWhitespace(*end)) *end = 0;
      else return;
  }
}

Arkadiy 2008-10-02 13:42:40

Answer 9

+2 A:

First off, thanks to everyone for the advice and help. Ferruccio got me closest, and here's how it ended up:

#include <stdio.h>

#define MAXLINE 1000

int getline(char line[], int max);
int trim(char line[]);

int main(void)
{
    char line[MAXLINE];
    int length;

    while ((length = getline(line, MAXLINE)) > 0)
    {
        if (trim(line) != 0)
            printf("%s\n", line);
    }

    return 0;
}

int getline(char line[], int max)
{
    int c, i;

    for (i = 0; i < max - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
        line[i] = c;

    if (c == '\n')
    {
        line[i] = c;
        ++i;
    }

    line[i] = '\0'; 
    return i;
}

int trim(char line[])
{
    int len = 0;
    for (len = 0; line[len] != '\0'; ++len)
        ;

    while (len > 0 && (line[len-1] == ' ' || line[len-1] == '\t' || line[len-1] == '\n'))
    {
        line[--len] = '\0';
    }

    return len;
}

John Rudy 2008-10-02 17:47:55

Answer 10

A:

Another example of doing the same thing. Did some minor violation by using C99-specific stuff. that will not be found in K&R. also used the assert() function which is part of the starndard library, but is probably not covered in chapter one of K&R.

#include <stdbool.h> /* needed when using bool, false and true. C99 specific. */
#include <assert.h> /* needed for calling assert() */

typedef enum {
  TAB = '\t',
  BLANK = ' '
} WhiteSpace_e;

typedef enum {
  ENDOFLINE = '\n',
  ENDOFSTRING = '\0'
} EndofLine_e;

bool isWhiteSpace(
  char character
) {
  if ( (BLANK == character) || (TAB == character ) ) {
    return true;
  } else {
    return false;
  }
}

bool isEndOfLine( 
  char character
) {
 if ( (ENDOFLINE == character) || (ENDOFSTRING == character ) ) {
    return true;
  } else {
    return false;
  }
}   

/* remove blanks and tabs (i.e. whitespace) from line-string */
void removeWhiteSpace(
  char string[]
) {
  int i;
  int indexOutput;

  /* copy all non-whitespace character in sequential order from the first to the last.
    whitespace characters are not copied */
  i = 0;
  indexOutput = 0;
  while ( false == isEndOfLine( string[i] ) ) {
    if ( false == isWhiteSpace( string[i] ) ) {
      assert ( indexOutput <= i );
      string[ indexOutput ] = string[ i ];
      indexOutput++;
    }
    i++; /* proceed to next character in the input string */
  }

  assert( isEndOfLine( string[ i ] ) );
  string[ indexOutput ] = ENDOFSTRING;

}

2008-10-02 19:03:06

ansaurus

tags:

views:

answers:

K & R Exercise: My Code Works, But Feels Stinky; Advice for Cleanup?

related questions