What is the best way to check if a single character is a whitespace?

I know how to check this through a regular expression.

But I am not sure if this is the best way if I only have a single character.

Isn't there a better way (concerning performance) for checking if it's a whitespace?

If I do something like this, I would miss whitespace characters like tabs, I guess?: if (ch == ' ') { .... }

+1  A: 

this covers spaces, tabs and newlines:

if ((ch == ' ') || (ch == '\t') || (ch == '\n'))

this should be best for performance. put the whitespace character you expect to be most likely, first.

if performance is really important, probably best to consider the bigger picture than individual operations like this...
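
For anyone who wants to measure rather than guess, here is a rough sketch of a timing harness that compares the comparison chain with a regular expression. The function names, the sample string, and the round count are all made up for illustration, and the numbers will differ from engine to engine.

```javascript
// Hypothetical helper names; none of these come from the thread.
function isSpaceByComparison(ch) {
    // Same idea as the chain above, with '\r' added.
    return ch === ' ' || ch === '\t' || ch === '\n' || ch === '\r';
}

function isSpaceByRegExp(ch) {
    return /^\s$/.test(ch);
}

// Calls fn on every character of sample, `rounds` times over, and
// returns the elapsed milliseconds plus the number of whitespace hits.
function timeIt(fn, sample, rounds) {
    var start = Date.now();
    var hits = 0;
    for (var r = 0; r < rounds; r++) {
        for (var i = 0; i < sample.length; i++) {
            if (fn(sample.charAt(i))) {
                hits++;
            }
        }
    }
    return { ms: Date.now() - start, hits: hits };
}

// Assumed sample input; use text that resembles your real data.
var sample = 'The quick brown fox\tjumps over\r\nthe lazy dog';
console.log('comparison:', timeIt(isSpaceByComparison, sample, 100000));
console.log('regexp:', timeIt(isSpaceByRegExp, sample, 100000));
```

The hit counts only agree here because the sample contains nothing beyond plain ASCII whitespace; with Unicode spaces in the input the two functions would give different answers.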

Peter
there are also '\r' and '\r\n', but I would say it's better to use regex than use an if with 5 conditions. It would be faster in my opinion.
stefita
well, it's always best to profile in cases like this. depending on how likely each character is, the short-circuiting of `||` might actually make it faster.
Peter
This would do: function isWhiteSpace(ch) { return " \t\n\r\v".indexOf(ch) != -1; }
Locksfree
Thanks for all the answers. It's clear to me that it's better to use a RegExp... Better safe than sorry ;)
edbras
+3  A: 

I have referenced the set of whitespace characters matched by PHP's trim function without shame (minus the null byte, I have no idea how well browsers will handle that).

if (' \t\n\r\v'.indexOf(ch) > -1) {
    // ...
}

This looks like premature optimization to me though.

Alex Barrett
+1 for premature optimization.
Glenn
A: 

Many languages have an 'IsSpace' or similar function in the library. JavaScript doesn't, for some reason, but that doesn't stop you writing your own.

I should confess - I don't use JavaScript much, so I could be out of date on this.

Don't forget that line ends are platform dependent. On Windows, you get a carriage return and a linefeed. On old Macs (before OS X, IIRC) the line end was a carriage return without a linefeed. Either way, it's best to count \r as whitespace. \v (vertical tab) is also sometimes included. In Unicode, you may also need to worry about such beasts as en-spaces, em-spaces, non-breaking spaces etc.
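
A hand-written version along those lines might look like the sketch below. The name isSpace and the exact character list are assumptions for illustration, not a complete Unicode whitespace table, so trim or extend the set to suit your input.

```javascript
// Hypothetical name and character list; adjust the set to what your input needs.
var WHITESPACE_CHARS =
    ' \t\n\r\v\f' +   // space, tab, line feed, carriage return, vertical tab, form feed
    '\u00A0' +        // non-breaking space
    '\u2002\u2003' +  // en space, em space
    '\u2028\u2029';   // line separator, paragraph separator

function isSpace(ch) {
    // Require exactly one character: indexOf('') returns 0, which would
    // otherwise report the empty string as whitespace.
    return typeof ch === 'string' && ch.length === 1 &&
        WHITESPACE_CHARS.indexOf(ch) !== -1;
}

console.log(isSpace('\r'));      // carriage return
console.log(isSpace('\u2003'));  // em space
console.log(isSpace('x'));       // ordinary letter
```

The length check matters for any indexOf-based lookup, since an empty argument is always "found" at position 0.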

You may be better off using a regular expression (assuming there's a match-any-whitespace class), or using a strip-whitespace function and seeing what's left after.
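
The strip-and-see-what's-left idea could be sketched like this, assuming the engine provides String.prototype.trim (it was standardized in ECMAScript 5, so very old browsers may lack it). The function name is invented for the example.

```javascript
// Hypothetical helper: a single character counts as whitespace if
// nothing remains after trimming it.
function isBlankAfterTrim(ch) {
    // The length check keeps the empty string from passing, because
    // ''.trim() is also ''.
    return ch.length === 1 && ch.trim() === '';
}

console.log(isBlankAfterTrim('\t'));  // tab
console.log(isBlankAfterTrim('x'));   // ordinary letter
```

Which characters get stripped is up to the engine's trim, so this inherits whatever definition of whitespace the browser uses.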

Steve314
+3  A: 

If you only want to test for certain whitespace characters, do so manually; otherwise, use a regular expression, i.e.

/\s/.test(ch)

Keep in mind that different browsers match different characters, e.g. in Firefox, \s is equivalent to (source)

[ \f\n\r\t\v\u00A0\u2028\u2029]

whereas in Internet Explorer, it should be (source)

[ \f\n\r\t\v]

The MSDN page actually forgot the space ;)
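
If you want to see what \s matches in the engine you are actually running, a quick probe like the following can list the characters. This is only a sketch; the function name is invented here, and it only walks the 16-bit range that String.fromCharCode covers.

```javascript
// Hypothetical helper: lists every 16-bit character code that /\s/
// matches in the running engine, formatted as U+XXXX.
function listWhitespaceCodes() {
    var whitespace = /\s/;
    var found = [];
    for (var code = 0; code <= 0xFFFF; code++) {
        if (whitespace.test(String.fromCharCode(code))) {
            // Pad the hex value to four digits.
            var hex = ('0000' + code.toString(16).toUpperCase()).slice(-4);
            found.push('U+' + hex);
        }
    }
    return found;
}

console.log(listWhitespaceCodes().join(' '));
```

Running it in each browser you care about shows exactly where they disagree.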

Christoph
This is only part of a story :) Here's a full table I published recently - http://thinkweb2.com/projects/prototype/whitespace-deviations/
kangax
@kangax: nice to know; at least for IE, the documentation is consistent with the actual result (missing space aside)
Christoph
A: 
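var testWhite = function (x) {
    // Matches a string consisting of exactly one whitespace character.
    var white = /^\s$/;
    // Only the first character of the argument is tested.
    return white.test(x.charAt(0));
};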

This small function will allow you to enter a string of variable length as an argument and it will report "true" if the first character is white space or "false" otherwise. You can easily put any character from a string into the function using the indexOf or charAt methods. Examples:

var str = "Today I wish I were not in Afghanistan.";
testWhite(str.charAt(9));  // This would test character "i" and would return false.
testWhite(str.charAt(str.indexOf("I") + 1));  // This would return true.