I need to split a string at all whitespace; the result should contain ONLY the words themselves.

How can I do this in vb.net?

Tabs, newlines, etc. must all be treated as separators!

This has been bugging me for quite a while now, as the syntax highlighter I made completely ignores the first word on every line except the very first one.

+3  A: 

Try this:

Regex.Split("your string here", "\s+")
Rubens Farias
@? What does the @ do? It gives me a syntax error.
Cyclone
It's C#. You should be fine without it.
Jimmy
Sorry, it was C#; you can safely strip it.
Rubens Farias
A: 
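Dim words As String = "This is a list of words, with: a bit of punctuation" + _
                          vbTab + "and a tab character." + vbNewLine
' vbNewLine is CR+LF and CChar keeps only the first character, so list CR and LF separately.
Dim separators As Char() = New Char() {" "c, ControlChars.Tab, ControlChars.Cr, ControlChars.Lf}
Dim split As String() = words.Split(separators, StringSplitOptions.RemoveEmptyEntries)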
Ed Swangren
Didn't work. Looks cool though lol
Cyclone
What do you mean it didn't work?
Ed Swangren
It simply didn't work.
Cyclone
+2  A: 

String.Split() (no parameters) does split on all whitespace (including LF/CR).

Jimmy
Why didn't they include that as an overload lol? Thanks so much!
Cyclone
Because it resolves to the Split(params char[]) overload, with an empty array. The documentation for that overload mentions this behavior.
Jimmy
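To make the parameterless Split() behavior discussed above concrete, here is a minimal sketch, not a definitive implementation. The module name SplitDemo and the sample string are made up for illustration. It assumes a console project and shows both the plain call and the overload that takes StringSplitOptions.RemoveEmptyEntries.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module SplitDemo
    Sub Main()
        ' Hypothetical sample input: two spaces, a tab and a CR+LF pair.
        Dim source As String = "first  second" & vbTab & "third" & vbCrLf & "fourth"

        ' No arguments: resolves to Split(ParamArray separator As Char()) with an
        ' empty array, which splits on every single white-space character.
        Dim raw As String() = source.Split()
        Console.WriteLine(raw.Length)                ' 6 (includes two empty strings)

        ' An empty separator array plus RemoveEmptyEntries drops the empty strings
        ' produced by adjacent white-space characters.
        Dim words As String() = source.Split(New Char() {}, StringSplitOptions.RemoveEmptyEntries)
        Console.WriteLine(String.Join("|", words))   ' first|second|third|fourth
    End Sub
End Module
```

Passing New Char() {} rather than Nothing avoids an ambiguous call between the Char() and String() overloads.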
A: 

String.Split() splits on every single whitespace character, so the result will usually contain empty strings wherever two or more whitespace characters are adjacent. The Regex solution Rubens Farias has given is the correct way to do it. I have upvoted his answer, but I want to make a small addition, dissecting the regex:

\s is a character class that matches all whitespace characters.

In order to split the string correctly when it contains multiple whitespace characters between words, we need to add a quantifier (or repetition operator) to the specification to match all whitespace between words. The correct quantifier to use in this case is +, meaning "one or more" occurrences of a given specification. While the syntax "\s+" is sufficient here, I prefer the more explicit "[\s]+".

Johannes Rudolph
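As a rough illustration of the "\s+" pattern dissected above, the following sketch splits a made-up two-line snippet into words. The module name RegexSplitDemo and the sample input are assumptions for the example only. The Trim() call is an addition not mentioned in the thread: Regex.Split returns an empty first or last element when the input starts or ends with a match.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RegexSplitDemo
    Sub Main()
        ' Hypothetical sample input with leading/trailing spaces, a tab and CR+LF.
        Dim source As String = "  Dim x" & vbTab & "As  Integer" & vbCrLf & "x = 1 "

        ' \s+ treats each run of white space as one separator, so no empty
        ' strings appear between words. Trim() removes the leading and trailing
        ' white space that would otherwise yield empty first/last elements.
        Dim words As String() = Regex.Split(source.Trim(), "\s+")

        Console.WriteLine(String.Join("|", words))   ' Dim|x|As|Integer|x|=|1
    End Sub
End Module
```

Because the CR+LF pair is matched as a single run, the first word of the second line comes through like any other word.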