Lieven's solution gets most of the way there and, as he notes in his comments, it's just a matter of changing the ending to match Bartek's solution. The end result is the following working regex:
(?<=")\w[\w\s]*(?=")|\w+|"[\w\s]*"
Input: Here is "my string" it has "six matches"
Output:
- Here
- is
- "my string"
- it
- has
- "six matches"
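To reproduce that list in C#, a minimal sketch might look like the following. The class name `QuotedMatchDemo` is only an illustrative placeholder, and the pattern is the one above with each quote doubled for the verbatim string literal.

```csharp
using System;
using System.Text.RegularExpressions;

class QuotedMatchDemo
{
    static void Main()
    {
        // Same pattern as above; inside @"..." every literal " is written as "".
        var regex = new Regex(@"(?<="")\w[\w\s]*(?="")|\w+|""[\w\s]*""");
        string input = @"Here is ""my string"" it has ""six matches""";

        // Match.Value is the whole match, so quoted tokens keep their quotes.
        foreach (Match m in regex.Matches(input))
        {
            Console.WriteLine("- " + m.Value);
        }
    }
}
```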
Unfortunately, the matches still include the quotes. You can instead use the following pattern:
(("((?<token>.*?)(?<!\\)")|(?<token>[\w]+))(\s)*)
Then read the named "token" group from each match, as follows:
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

RegexOptions options = RegexOptions.None;
// In a verbatim string (@"..."), each literal quote is written as "".
Regex regex = new Regex( @"((""((?<token>.*?)(?<!\\)"")|(?<token>[\w]+))(\s)*)", options );
string input = @" Here is ""my string"" it has "" six matches"" ";
// Keep only the matches in which the named group "token" participated.
var result = (from Match m in regex.Matches( input )
              where m.Groups[ "token" ].Success
              select m.Groups[ "token" ].Value).ToList();
for ( int i = 0; i < result.Count; i++ )
{
    Debug.WriteLine( string.Format( "Token[{0}]: '{1}'", i, result[ i ] ) );
}
Debug output (the leading space in the last token comes from the input string):
Token[0]: 'Here'
Token[1]: 'is'
Token[2]: 'my string'
Token[3]: 'it'
Token[4]: 'has'
Token[5]: ' six matches'
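The `(?<!\\)` lookbehind in that pattern appears intended to stop a backslash-escaped quote from closing a quoted token. The sketch below tries that out on a hypothetical input that is not part of the original example. The class name `EscapedQuoteDemo` is also just a placeholder.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class EscapedQuoteDemo
{
    static void Main()
    {
        var regex = new Regex(@"((""((?<token>.*?)(?<!\\)"")|(?<token>[\w]+))(\s)*)");

        // Hypothetical input: say "a \"quoted\" word" done
        string input = @"say ""a \""quoted\"" word"" done";

        // Collect the "token" group from every match where it participated.
        var tokens = regex.Matches(input)
                          .Cast<Match>()
                          .Where(m => m.Groups["token"].Success)
                          .Select(m => m.Groups["token"].Value)
                          .ToList();

        for (int i = 0; i < tokens.Count; i++)
        {
            Console.WriteLine("Token[{0}]: '{1}'", i, tokens[i]);
        }
    }
}
```

The quoted token comes back as one piece, with the backslashes still in it. If you need the unescaped text, you would have to strip them in a separate step.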