Okay, now this is more a rant about Linux than a question, but maybe someone knows how to do what I want. I know this can be achieved using the sort
command, but I want a better solution because getting that to work is about as easy as writing a C program to do the same thing.
For argument's sake, let's say I have these files (my real files follow the same pattern, I just have many more of them):
- file-10.xml
- file-20.xml
- file-100.xml
- file-k10.xml
- file-k20.xml
- file-k100.xml
- file-M10.xml
- file-M20.xml
- file-M100.xml
Now this turns out to be the order I want them sorted in. Incidentally, it is also the order Windows sorts them into by default. That's nice. Windows treats each run of consecutive digits as a single number, and numbers sort before letters.
If I type ls
at the Linux command line, I get the following garbage. Notice that the 100 is out of place in each group. This is a bigger deal when I have hundreds of these files that I want to view in a report, in order.
- file-100.xml
- file-10.xml
- file-20.xml
- file-k100.xml
- file-k10.xml
- file-k20.xml
- file-M100.xml
- file-M10.xml
- file-M20.xml
I can use ls -1 | sort -n -k 1.6
to get the ones without 'k' or 'M' correct...
- file-k100.xml
- file-k10.xml
- file-k20.xml
- file-M100.xml
- file-M10.xml
- file-M20.xml
- file-10.xml
- file-20.xml
- file-100.xml
I can use ls -1 | sort -n -k 1.7
to get none of it correct
- file-100.xml
- file-10.xml
- file-20.xml
- file-k10.xml
- file-M10.xml
- file-k20.xml
- file-M20.xml
- file-k100.xml
- file-M100.xml
Okay, fine. Let's really get it right. ls -1 | grep "file-[0-9]*\.xml" | sort -n -k1.6 && ls -1 file-k*.xml | sort -n -k1.7 && ls -1 file-M*.xml | sort -n -k1.7
- file-10.xml
- file-20.xml
- file-100.xml
- file-k10.xml
- file-k20.xml
- file-k100.xml
- file-M10.xml
- file-M20.xml
- file-M100.xml
Whew! Boy, am I glad the "power of the Linux command line" saved me there. (This isn't practical for my situation, because instead of ls -1
I have a command that is another line or two long.)
Now, the Windows behavior is simple, elegant, and does what you want it to do 99% of the time. Why can't I have that in Linux? Why oh why does sort
not have an "automagically sort numbers in a way that doesn't make me bang my head into a wall" switch?
Here's the pseudo-code for C++:
bool compare_two_strings_to_avoid_head_injury(string a, string b)
{
    string::iterator ai = a.begin();
    string::iterator bi = b.begin();
    for(; ai != a.end() && bi != b.end(); ai++, bi++)
    {
        if (*ai is numerical)
            gobble up the number incrementing ai past numerical chars;
        if (*bi is numerical)
            gobble up the number incrementing bi past numerical chars;
        actually compare *ai and *bi and/or the gobbled up number(s) here
        to determine if we need to compare more chars or can return the
        answer now;
    }
    return something here;
}
Was that so hard? Can someone put this in sort and send me a copy? Please?
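For what it's worth, the pseudo-code above can be fleshed out into something that actually compiles. This is only a rough sketch under my own assumptions: the name natural_less is made up, runs of digits are compared as numbers (ignoring leading zeros), and everything else is compared case-insensitively so that 'k' lands before 'M' as in the listing I want. It is not a claim about what Windows does internally.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns true if a should sort before b in "natural" order.
// Assumptions: digit runs compare by numeric value, all other characters compare
// case-insensitively by character code. Strings that differ only in case or in
// leading zeros are treated as equivalent.
bool natural_less(const std::string& a, const std::string& b)
{
    auto is_digit = [](char c) {
        return std::isdigit(static_cast<unsigned char>(c)) != 0;
    };
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (is_digit(a[i]) && is_digit(b[j])) {
            // Skip leading zeros so that "007" and "7" count as the same number.
            while (i < a.size() && a[i] == '0') ++i;
            while (j < b.size() && b[j] == '0') ++j;
            // Gobble up the remaining digits of each number.
            std::size_t ie = i, je = j;
            while (ie < a.size() && is_digit(a[ie])) ++ie;
            while (je < b.size() && is_digit(b[je])) ++je;
            // More digits means a larger number.
            if (ie - i != je - j) return (ie - i) < (je - j);
            // Same digit count: plain lexicographic order equals numeric order.
            int c = a.compare(i, ie - i, b, j, je - j);
            if (c != 0) return c < 0;
            i = ie;
            j = je;
        } else {
            // Ordinary characters (or a digit against a non-digit): compare
            // case-insensitively. Digits have lower codes than letters, so a
            // number still sorts before a letter.
            int ca = std::tolower(static_cast<unsigned char>(a[i]));
            int cb = std::tolower(static_cast<unsigned char>(b[j]));
            if (ca != cb) return ca < cb;
            ++i;
            ++j;
        }
    }
    // Everything matched so far: the string that ran out first sorts first.
    return i == a.size() && j != b.size();
}

// Convenience wrapper: sort a list of names in natural order, in place.
void natural_sort(std::vector<std::string>& names)
{
    std::stable_sort(names.begin(), names.end(), natural_less);
}
```

Calling natural_sort on the nine file names from the top of this post should put them in the order I asked for. Of course, that still leaves me writing a C++ program instead of passing a switch to sort.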