I've got a large set of data for which computing the sort key is fairly expensive. What I'd like to do is use the DSU pattern where I take the rows and compute a sort key. An example:
Qty Name Supplier
Row 1: 50 Widgets IBM
Row 2: 48 Thingies Dell
Row 3: 99 Googaws IBM
To sort by Quantity and Supplier I could have the sort keys: 0050 IBM
, 0048 Dell
, 0099 IBM
. The numbers are right-aligned and the text is left-aligned, everything is padded as needed.
If I need to sort by the Quanty in descending order I can just subtract the value from a constant (say, 10000) to build the sort keys: 9950 IBM
, 9952 Dell
, 9901 IBM
.
How do I quickly/cheaply build a descending key for the alphabetic fields in C#?
[My data is all 8-bit ASCII w/ISO 8859 extension characters.]
Note: In Perl, this could be done by bit-complementing the strings:
$subkey = $string ^ ( "\xFF" x length $string );
Porting this solution straight into C# doesn't work:
subkey = encoding.GetString(encoding.GetBytes(stringval).
Select(x => (byte)(x ^ 0xff)).ToArray());
I suspect because of the differences in the way that strings are handled in C#/Perl. Maybe Perl is sorting in ASCII order and C# is trying to be smart?
Here's a sample piece of code that tries to accomplish this:
System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
List<List<string>> sample = new List<List<string>>() {
new List<string>() { "", "apple", "table" },
new List<string>() { "", "apple", "chair" },
new List<string>() { "", "apple", "davenport" },
new List<string>() { "", "orange", "sofa" },
new List<string>() { "", "peach", "bed" },
};
foreach(List<string> line in sample)
{
StringBuilder sb = new StringBuilder();
string key1 = line[1].PadRight(10, ' ');
string key2 = line[2].PadRight(10, ' ');
// Comment the next line to sort desc, desc
key2 = encoding.GetString(encoding.GetBytes(key2).
Select(x => (byte)(x ^ 0xff)).ToArray());
sb.Append(key2);
sb.Append(key1);
line[0] = sb.ToString();
}
List<List<string>> output = sample.OrderBy(p => p[0]).ToList();
return;