views:

113

answers:

4

I love the ideas presented in Brian Kernighan and Rob Pike's book, "The UNIX Programming Environment," where they focus on the point of working within an environment where you can put together many (small, precise, well understood) programs on the command line to accomplish many programming tasks.

I'm brushing up on strict ANSI C conventions and trying to stick to this philosophy. Somewhere in this book (I can get an exact page number if needed) they suggest that all programs in this environment should adhere to the following principles:

  1. If input is presented on the command line, as an argument to the program itself, process that input.

  2. If no input is presented on the command line, process input from stdin.

Here's a C program I wrote that will echo any input (numeric or alphabetic) that is a palindrome. My question specifically:

Is this a well behaved C program? In other words, is this what Kernighan and Pike were suggesting is the optimal behavior for a command line application like this?

#include <stdio.h>
#include <string.h> /* for strlen */

int main(int argc, char* argv[]) {

    char r_string[100];

    if (argc > 1) {

        int length = (int)strlen(argv[1]);

        int i = 0;
        int j = length;
        r_string[j] = (char)NULL;
        j--;
        for (i = 0; i < length; i++, j--) {
           r_string[j] = argv[1][i];
        }

        if (strcmp(argv[1], r_string) == 0) {
            printf("%s\n", argv[1]);
        }

    } else {

        char* i_string;

        while (scanf("%s", i_string) != EOF) {

            int length = (int)strlen(i_string);

            int i = 0;
            int j = length;
            r_string[j] = (char)NULL;
            j--;
            for (i = 0; i < length; i++, j--) {

               r_string[j] = i_string[i];
            }

            if (strcmp(i_string, r_string) == 0) {

                printf("%s\n", i_string); 
            }
        }
    }

    return 0;
}
+3  A: 

One problem that you have is a potential buffer overflow because you are writing an input of arbitrary length into a buffer with a fixed size. You can fix this by rejecting too long inputs or creating an array of the correct size dynamically. I would avoid using scanf.

Regarding the actual algorithm, you don't need to copy the string reversed and then compare the two strings. You could do the check using only a single copy of the string and a pointer at both ends, both moving in towards the middle.

Here is some code to show the principle:

char* a = /* pointer to first character in string */;
char* b = /* pointer to last character in string (excluding the null terminator) */;

while (a < b && *a == *b)
{
    a++;
    b--;
}

if (a >= b)
{
    // Is palindrome.
}

I agree with Javier that you factor the palindrome checking code out into a separate function.

Mark Byers
Excellent, I wasn't looking for improvements to the code itself, but now I'm intrigued. Can you supply some code to back this up? Thanks.
dvanaria
Regarding the buffer overflow: scanf can be used with a limit: `scanf("%1000s", buffer)`. This would limit the word length to 1000 characters, though. Allowing for arbitrary lengths would be more complicated (but possible). Additionally, the code reading the line to an uninitialized buffer (on the second part, when considering the stdin).
Hugo Peixoto
fgets is another alternative.
Mark Byers
If you consider a "word" as a sequence of non-space characters, I would use scanf. If you want to consider a whole line as a word, I would use fgets.
Hugo Peixoto
+1  A: 

Regarding the principles you specified, I believe that these tools usually take their arguments as filenames whose content is to be processed. Instead, you are treating them like the input itself.

Take sort, for example. If you don't specify any arguments, the contents from stdin will be sorted. Otherwise, the contents in the file whose filename you specified will be sorted. It is not the arguments themselves that are processed.

The code for this would be something along these lines:

FILE * input = stdin;
if (argc > 1)
{
  input = fopen(argv[1], "r");
  // handle possible errors from the fopen
}

while (fscanf(input, "%s", i_string) != EOF)
  // check if i_string is a palindrome and output to stdout

Also, you should be careful with the buffer overflow specified by Mark Byers.

You're not handling the string reading correctly. The i_string buffer is not initialized, and even if it were, you're should limit the number of bytes that scanf reads to avoid the mentioned overflow:

char i_string[1000];
while (scanf("999%s", i_string) != EOF)
  if (is_palindrome(i_string)) /* Use any function defined in the other answers */
    printf("%s\n", i_string);

You must always reserve one more byte (1000 vs 999) to account for the NULL string terminator. If you want to allow arbitrary length strings, I think you'll have to dinamically allocate the buffer, and resize it in case bigger strings are present. This would be slightly more complicated.

Hugo Peixoto
Well, I have to disagree. Providing a file to be processed as input (at the command line) is a function of the shell. You can redirect input using the < or | shell commands, but that only changes what "stdin" refers to. I may be wrong here, your comments are welcome. Thanks for the reply.
dvanaria
Well, this opinion is from experience with some of the unix tools. cat, sort, sed, grep and probably others receive the filename as a command argument. You can, indeed, pipe them in, but that's why you have the option to omit the arguments. I don't remember of any tool (although I agree that there may be many) that follow your convention.
Hugo Peixoto
Added some information on the buffer initialization and overflow.
Hugo Peixoto
+3  A: 

Yes, I think that you are following the R&K advice. As Hugo said, you could take the argumentas a filename, bu,t IMHO, for this simple program, I'd say that taking the parameter as the palindrome itself may make more sense.

Also, if you allow me extra advice, I would separate the functionality of reading a string from checking whether it is a palindrome or not, because you have that code duplicated right now.

int ispalindrome(const char* c) { 
   size_t len = strlen(c);
   size_t limit = len/2;
   size_t i;
   for (i = 0; i < limit; i++) {
     if(c[i]!=c[len-i-1]) break; /* Different character found */
   }
   return i==limit; /* If we reached limit, it's a palyndrome */
}

Of course, I am pretty sure this can be improved (it may even have a bug, I am typping quite fast), but once that you have your string, be either from command line or user input, you can call this function or a functiom like this.

NOTE: Edited to reflect comment from Mark, thanks a lot, Mark!

Javier
Right, I suspect I'm duplicating code here. It could be refactored to be more concise. I was going for readability - can you give some example code? I'm most interested in how a C program should handle input, given that input can be redirected at the command line, by the shell.
dvanaria
Added in the answer
Javier
Why can't an odd-length string be a palindrome? What about "racecar"?
Mark Byers
You are right, Mark. Thanks a lot! I must confess that I was thinking in palindromes in my native tongue, which is Spanish, and I was not able to come with one that was not even :) Thanks a lot again!
Javier
+1  A: 

It is useful for text filters such as a program that prints only lines with palindromes to specify input files via command line arguments e.g., it allows:

$ palindromes input*.txt # file patterns
$ find -name '*.txt' -print0 | xargs -0 palindromes

It is common convention that is supported by many languages. Below are scripts in Perl, Python, C that has the same usage:

Usage: palindromes [FILE]
Print lines that are polindromes in each FILE.

With no FILE, or when FILE is -, read standard input.

in Perl

#!/usr/bin/perl -w
while (<>) { # read stdin or file(s) specified at command line
    $line = $_;
    s/^\s+//; # remove leading space
    s/\s+$//; # remove trailing space
    print $line if $_ eq reverse $_; # print line with a palindrome
}

in Python

#!/usr/bin/env python
import fileinput, sys

for line in fileinput.input(): # read stdin or file(s) specified at command line
    s = line.strip()    # strip whitespace characters
    if s == s[::-1]: # is palindrome
        sys.stdout.write(line)

in C

#!/usr/local/bin/tcc -run -Wall
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum { 
  MATCH,
  NO_MATCH,
  ERROR 
};

bool is_palindrome(char *first, char *last) {
  /** Whether a line defined by range [first, last) is a palindrome.

      `last` points either to '\0' or after the last byte if there is no '\0'.
      Leading and trailing spaces are ignored.
      All characters including '\0' are allowed
  */
  --last; // '\0'
  for ( ; first < last && isspace(*first); ++first); // skip leading space
  for ( ; first < last && isspace(*last); --last);   // skip trailing space
  for ( ; first < last; ++first, --last)
    if (*first != *last)
      return false;
  return true;
}

int palindromes(FILE *fp) {
  /** Print lines that are palindromes from the file.

      Return 0 if any line was selected, 1 otherwise;
      if any error occurs return 2
   */
  int ret = NO_MATCH;

  char *line = NULL;
  size_t line_size = 0; // line size including terminating '\0' if any
  ssize_t len = -1;     // number of characters read, including '\n' if any,
                        // . but not including the terminating '\0'
  while ((len = getline(&line, &line_size, fp)) != -1) {
    if (is_palindrome(line, line + len)) {
      if (printf("%s", line) < 0) {
        ret = ERROR;
        break;
      }
      else
        ret = MATCH;
    }
  }
  if (line)
    free(line);
  else 
    ret = ERROR;

  if (!feof(fp))
    ret = ERROR;
  return ret;
}

int main(int argc, char* argv[]) {
  int exit_code = NO_MATCH;
  if (argc == 1) // no input file; read stdin
    exit_code = palindromes(stdin);
  else {
    // process each input file
    FILE *fp = NULL;
    int ret = 0;
    int i;
    for (i = 1; i < argc; i++) { 
      if (strcmp(argv[i], "-") == 0)
        ret = palindromes(stdin);
      else if ((fp = fopen(argv[i], "r")) != NULL) {
        ret = palindromes(fp);
        fclose(fp);
      } else {
        fprintf(stderr, "%s: %s: could not open: %s\n",
                argv[0], argv[i], strerror(errno));
        exit_code = ERROR;
      }
      if (ret == ERROR) {
        fprintf(stderr, "%s: %s: error: %s\n",
                argv[0], argv[i], strerror(errno));
        exit_code = ERROR;
      } else if (ret == MATCH && exit_code != ERROR) 
        // return MATCH if at least one line is a MATCH, propogate error
        exit_code = MATCH;
    }
  }
  return exit_code;
}

Exit status is 0 if any line was selected, 1 otherwise; if any error occurs, the exit status is 2. It uses GNU getline() that allows arbitrary large lines as an input.

J.F. Sebastian