Hi, I would like to ask if there is a more elegant way to do this:

List<char> unallowed = new List<char>();

for (char c = '\u0000'; c <= '\u0008'; c++) {
    unallowed.Add(c);
}

for (char c = '\u000B'; c <= '\u000C'; c++) {
    unallowed.Add(c);
}

// And so on...

I have to add a few contiguous ranges of Unicode characters to the list, and the only refactoring I can think of is to write my own method so that I don't have to type the for loops repeatedly. And I'm not even sure it's worth it.

A: 

It would make sense to have the disallowed Unicode characters in a list which you read from a file or internal resource, rather than hard-coded in the application.
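A minimal sketch of that idea, assuming a text format of one inclusive range per line written as hexadecimal code units (for example "0000-0008"), with a line that has no dash standing for a single character. The class name UnallowedCharLoader and the file format are made up for illustration; they are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper: the name and the "XXXX-YYYY" line format are assumptions.
public static class UnallowedCharLoader
{
    // Expands lines such as "0000-0008" (hex, inclusive) into a list of chars.
    // Input is assumed to be well-formed with values no larger than FFFF.
    public static List<char> Parse(IEnumerable<string> lines)
    {
        List<char> result = new List<char>();
        foreach (string rawLine in lines)
        {
            string line = rawLine.Trim();
            if (line.Length == 0) continue; // skip blank lines

            string[] parts = line.Split('-');
            int first = int.Parse(parts[0], NumberStyles.HexNumber, CultureInfo.InvariantCulture);
            // A line without '-' denotes a single character.
            int last = parts.Length > 1
                ? int.Parse(parts[1], NumberStyles.HexNumber, CultureInfo.InvariantCulture)
                : first;

            // Loop over int rather than char so a range ending at FFFF cannot wrap around.
            for (int code = first; code <= last; code++)
            {
                result.Add((char)code);
            }
        }
        return result;
    }
}
```

With a file in that format, the list could then be built with something like UnallowedCharLoader.Parse(File.ReadAllLines("unallowed.txt")), where the file name is again only a placeholder.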

Charlie Salts
In .NET, the char data type is a 16-bit Unicode character.
Guffa
Richard Berg
@Charlie, in .Net, a Char is a 16-bit Unicode character.
Eric J.
Not in C#, see http://msdn.microsoft.com/en-us/library/system.char.aspx
Yeah yeah, my bad. I was thinking of byte. Edited to reflect the sobering comments :)
Charlie Salts
A: 

That's pretty much how I create lists of Char (actually did that just yesterday). If you have a lot of ranges to add to the list, you could make it a bit easier/less repetitive by defining a method such as AddUnallowed(char from, char to) that adds to the list.

Eric J.
+4  A: 

Well, you could do something like:

    List<char> chars = new List<char>();
    chars.AddRange(Enumerable.Range(0x0000, 9).Select(i => (char)i));
    chars.AddRange(Enumerable.Range(0x000B, 2).Select(i => (char)i));

Not sure it is worth it, though - especially given the need to use "count" rather than "end". Probably easier to write your own extension method...

static class Program {
    // Extension methods must be declared in a non-generic static class.
    static void AddRange(this IList<char> list, char start, char end) {
        for (char c = start; c <= end; c++) {
            list.Add(c);
        }
    }
    static void Main() {
        List<char> chars = new List<char>();
        chars.AddRange('\u0000', '\u0008');
        chars.AddRange('\u000B', '\u000C');
    }
}


Re your comment: extension methods aren't a .NET 3.5 feature. They are a C# 3.0 feature. So as long as you compile the code to target .NET 2.0 / 3.0 (as appropriate), it doesn't matter if the client doesn't have .NET 3.5. You do, however, need to define the ExtensionAttribute yourself, which takes only a few lines of code:

namespace System.Runtime.CompilerServices
{
    [AttributeUsage(AttributeTargets.Assembly |
        AttributeTargets.Class | AttributeTargets.Method)]
    public sealed class ExtensionAttribute : Attribute { }
}

Or just go for broke and download LINQBridge and use all of LINQ-to-Objects in .NET 2.0.

Marc Gravell
@Marc: Nice alternative. Just want to point out that this is a .Net 3.5 feature. It's not a good choice for me personally as many of my applications have to run in environments (usually large, corporate ones) where 3.5 is not necessarily common and requesting an upgrade can be difficult.
Eric J.
"Just want to point out that this is a .Net 3.5 feature" - no, it isn't; see update.
Marc Gravell
+1  A: 

Adding a method to add the range is probably the simplest refactoring, and I think it would be worth it, just because it makes the ranges themselves easier to read. Using MiscUtil's Range class you could do something like:

list.AddRange('\u000b'.To('\u000c').Step(1))

but that would still be less clear than having an extra method (possibly an extension method on List<char>?) and writing:

list.AddCharRange('\u000b', '\u000c');

The extra cruft is okay for one or two calls, but if you're repeating this a number of times you really want to get rid of as much extraneous text as possible, to make the useful data stand out. It's a shame that extension methods aren't considered by collection initializers, as otherwise that would make a really neat solution.

Do you definitely need a List<char> though due to other restrictions? This sounds like you really want a Predicate<char> to say whether or not a character is allowed - and that could be implemented by combining ranges etc.
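As a rough sketch of what that could look like, assuming a small range type of your own: the names CharRange and CharFilter below are hypothetical, not from MiscUtil or the framework, and the code sticks to C# 2.0 syntax so it does not depend on LINQ.

```csharp
using System;

// Hypothetical value type describing one inclusive range of characters.
public struct CharRange
{
    public readonly char First;
    public readonly char Last;

    public CharRange(char first, char last)
    {
        First = first;
        Last = last;
    }

    public bool Contains(char c)
    {
        return c >= First && c <= Last;
    }
}

// Hypothetical helper that combines several ranges into one predicate.
public static class CharFilter
{
    // Returns a predicate that is true when the character lies in any of the ranges.
    public static Predicate<char> InAnyRange(params CharRange[] ranges)
    {
        return delegate(char c)
        {
            foreach (CharRange range in ranges)
            {
                if (range.Contains(c)) return true;
            }
            return false;
        };
    }
}
```

The ranges from the question would then be written once, as CharFilter.InAnyRange(new CharRange('\u0000', '\u0008'), new CharRange('\u000B', '\u000C')), and the resulting predicate called per character instead of searching a list.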

Jon Skeet
A: 

You can put the ranges in an array and loop through:

char[] ranges = {
   '\u0000','\u0008',
   '\u000b','\u000c',
   '0','9',
   'a','z'
};

// Each pair of entries is (first, last), so step through the array two at a time.
for (int i = 0; i < ranges.Length; i += 2) {
   for (char c = ranges[i]; c <= ranges[i + 1]; c++) {
      unallowed.Add(c);
   }
}
Guffa
A: 

There is some duplication in your code, as you have already recognized. Duplication is usually bad, and a method would make your code more readable, so I think it's worth it. What about an extension method:

public static class YourHelper
{
    public static void AddCharRange(this List<char> list, char first, char last)
    {
        for (char c = first; c <= last; c++)
        {
            list.Add(c);
        }
    }
}

and then:

List<char> unallowed = new List<char>();
unallowed.AddCharRange('\u0000', '\u0008');

Depending on your use case, I might name the method "Unallow" instead of "AddCharRange".

Achim