views:

552

answers:

3

Looking for a proven to work algorithm for production. Did see this example but not finding much else on the web or in books.

i.e. file_10.txt > file_2.txt

Thanks.

+3  A: 

The basic sort function would be standard C qsort(). It is parameterized to take a comparison function, and the comparison function is what you would need to write to do the natural ordering.

Your cross-referenced question includes a C implementation of a comparison function.

A Google search 'natural sort c' shows a SourceForge implementation.

Jonathan Leffler
I did see the source fourge example, I am on windows. Hoping for more of a well tested and used compare.
Tommy
+2  A: 

I assume you already know the C standard library qsort() function:

void qsort(void *base,
           size_t nel,
           size_t width,
           int (*compar)(const void *, const void *);

That last parameter is a function pointer, which means you can pass any function to it. You could use strcmp(), in fact, but that would give you ASCIIbetical, and you specifically want a natural sort.

In that case, you could write one pretty easily:

#include <ctype.h>

int natural(const char *a, const char *b)
{
    if(isalpha(*a) && isalpha(*b))
      {
        // compare two letters
      }
    else
      {
        if(isalpha(*a))
          {
            // compare a letter to a digit (or other non-letter)
          }
        else if(isalpha(*b))
          {
            // compare a digit/non-letter to a letter
          }
        else
          {
            // compare two digits/non-letters
          }
      }
}

Some of the elses could be cleared up if you just return early, but there's a basic structure. Check ctype.h for functions like isalpha() (if a character is part of the alphabet), isdigit(), isspace(), and more.

Chris Lutz
A nice idea, but a bit oversimplified. It's not clear why I bothered writing and testing code...
Norman Ramsey
It's hard to answer the question if we don't know what he means by "natural" and he ignored the (well designed and perfectly portable) version that was linked.
Chris Lutz
+4  A: 

Here is a (tested) comparison function that does the job. It understands only unsigned integers, not signed integers or floating point:

#include <stdlib.h>
#include <ctype.h>

/* like strcmp but compare sequences of digits numerically */
int strcmpbynum(const char *s1, const char *s2) {
  for (;;) {
    if (*s2 == '\0')
      return *s1 != '\0';
    else if (*s1 == '\0')
      return 1;
    else if (!(isdigit(*s1) && isdigit(*s2))) {
      if (*s1 != *s2)
        return (int)*s1 - (int)*s2;
      else
        (++s1, ++s2);
    } else {
      char *lim1, *lim2;
      unsigned long n1 = strtoul(s1, &lim1, 10);
      unsigned long n2 = strtoul(s2, &lim2, 10);
      if (n1 > n2)
        return 1;
      else if (n1 < n2)
        return -1;
      s1 = lim1;
      s2 = lim2;
    }
  }
}

If you want to use it with qsort, use this auxiliary function:

static int compare(const void *p1, const void *p2) {
  const char * const *ps1 = p1;
  const char * const *ps2 = p2;
  return strcmpbynum(*ps1, *ps2);
}

And you can do something on the order of

qsort(lines, next, sizeof(lines[0]), compare);
Norman Ramsey
I think the first "return 1" should be "return -1". Consider strcmpbynum ("", "x").
Darius Bacon