Does [_\s^"] mean underscore and whitespace but not " (quote) in Reg

I understand that the brackets ([ ]) mean a character range and that ^ means "but not", but my question is can you say [this^notthat] or do I have to separate them into two sets of brackets?

+8  A: 

^ is only special at the start of a character class. You can even write [^^] to say "not a caret".

There is no reason to match "underscore or whitespace, but not "" because by matching underscore or whitespace you are already guaranteed not to match ". Perhaps you want to say something like, "all uppercase letters except Q". In this case, the easiest option is to use subranges: [A-PR-Z].
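A quick sketch of both points in Python's re module (the question does not name a regex flavor, so Python is an assumption here; the variable names are made up for illustration):

```python
import re

# Caret in first position negates: "[^^]" is any single character except "^".
not_caret = re.compile(r"[^^]")
found_not_caret = not_caret.findall("a^b")      # ['a', 'b']

# Caret anywhere else is a literal caret.
a_or_caret = re.compile(r"[a^]")
found_literal = a_or_caret.findall("a^b")       # ['a', '^']

# Subranges: uppercase letters except Q.
upper_no_q = re.compile(r"[A-PR-Z]")
found_upper = upper_no_q.findall("PQRS")        # ['P', 'R', 'S']

print(found_not_caret, found_literal, found_upper)
```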

Marcelo Cantos
+1  A: 

It means underscore, whitespace, caret, or double-quote. As Marcelo pointed out, the caret is only special if it's the first character within the brackets.
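To check this, here is a small sketch using Python's re module (an assumption, since the question does not say which regex engine is in use):

```python
import re

# The class from the question: underscore, whitespace, caret, or double quote.
cls = re.compile(r'[_\s^"]')

# Each of these single characters is accepted by the class...
accepted = [ch for ch in ['_', ' ', '\t', '^', '"'] if cls.fullmatch(ch)]

# ...while an ordinary letter is not.
rejected = cls.fullmatch('a') is None

print(accepted, rejected)
```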

but my question is can you say [this^notthat] or do I have to separate them into two sets of brackets?

You have to separate them into two sets: [this][^that], which of course would mean "a t, h, i, or s, followed by any character except t, h, or a".
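A short Python sketch (assuming the re module; other flavors should behave the same for this pattern) showing that the two classes consume two consecutive characters:

```python
import re

# Two classes in a row match two characters, one per class.
pair = re.compile(r"[this][^that]")

ok = pair.fullmatch("sx") is not None         # "s" is in [this], "x" is not in [that]
blocked = pair.fullmatch("st") is None        # second character "t" is excluded
too_short = pair.fullmatch("s") is None       # a single character cannot match

print(ok, blocked, too_short)
```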

Mark
This won't work. It'll match against pairs of consecutive input characters; the OP wanted to apply inclusive and exclusive sets to each character.
Marcelo Cantos
@Marcelo: Well, that's what I wrote, isn't it? It's not clear what the OP actually wants to do. **Edit:** I responded that way because writing "include only this but not that" doesn't make sense unless "this" and "that" overlap in some way, i.e., the second part subtracts from the first. In which case I guess you could do a negated look-ahead... but I don't really want to get into that where it's not needed.
Mark
A: 

No, ^ has its negation effect only if it is the first character of a character class. For example, [abc^] matches "^". Also, a - at the end of the character class means a literal "-", so [\w-]+ matches all of "abc-def".
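Both cases can be tried out in Python (a sketch assuming the re module; the + quantifier is added only so that the class can cover the whole string):

```python
import re

# A caret that is not first in the class is an ordinary character.
caret_literal = re.fullmatch(r"[abc^]", "^") is not None

# A trailing hyphen is a literal hyphen, not a range operator.
hyphen_literal = re.fullmatch(r"[\w-]+", "abc-def") is not None

# Without the hyphen in the class, the same string is not fully matched.
without_hyphen = re.fullmatch(r"\w+", "abc-def") is None

print(caret_literal, hyphen_literal, without_hyphen)
```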

SHiNKiROU
A: 

As others have pointed out, ^ only means negation if it is the first character in a set.

A construct such as [ABC^DEF] to express "A, B or C, but not D, E or F" does not make sense. If it is an A, B or C, it cannot be D, E or F, so that part of the expression is redundant. If you have characters existing in both blocks you can simplify it: [this^notthat] => [is], by removing any character in the "this" part that also exists in the "that" part.
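The simplification is just a set difference. A rough Python sketch (assuming the re module; the names include, exclude and remaining are invented for this example) that builds the reduced class and compares it with the negated look-ahead form mentioned in the comments above:

```python
import re

include = set("this")       # characters we want to allow
exclude = set("notthat")    # characters we want to rule out

# Remove every excluded character from the included ones.
remaining = "".join(sorted(include - exclude))      # "is"

# Reduced class, escaped in case a special character survives.
simplified = re.compile("[" + re.escape(remaining) + "]")

# Same intent spelled with a negative look-ahead in front of the class.
lookahead = re.compile(r"(?![notthat])[this]")

# Both patterns accept and reject the same single characters.
agree = all(
    bool(simplified.fullmatch(ch)) == bool(lookahead.fullmatch(ch))
    for ch in "thisnoa^"
)
print(remaining, agree)
```

The look-ahead version is only worth it when the sets are awkward to subtract by hand; for small literal sets the reduced class is easier to read.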

If working with larger sequences, things can be a bit more confusing.

Anders Abel
I would drop the example at the end as it is incorrect -- that expression you give happily matches "barfoo". I would drop everything after the word "confusing" :)
Zac Thompson