What does this regular expression mean? It is in an XML schema that I am using:
([!-~]|[ ])*[!-~]([!-~]|[ ])*
-Dave
[!-~]
Matches any of the characters between "!" and "~" (the represented characters theoretically depend on the encoding in use)
[ ]
Matches a space character
(x|y)
Matches one of x or y
(x)*
Matches any number of consecutive occurrences of x (including none).
Any number of characters in the range ! to ~ or spaces, followed by exactly one character in the range ! to ~, followed by any number of characters in that same range or spaces again. So it would appear to be the same as:
([!-~ ])*[!-~]([!-~ ])*
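To check the claim that the two forms are the same, here is a rough sketch in Python. It is not XML Schema itself: XML Schema patterns always have to match the whole value, so I am assuming re.fullmatch is a fair stand-in. The names ORIGINAL, SIMPLIFIED, ALPHABET and agree_on_all_strings are made up for this example, and the test alphabet is just a small sample, so this is a sanity check rather than a proof.

```python
import re
from itertools import product

# The pattern from the question and the merged-class form suggested above.
ORIGINAL = re.compile(r"([!-~]|[ ])*[!-~]([!-~]|[ ])*")
SIMPLIFIED = re.compile(r"([!-~ ])*[!-~]([!-~ ])*")

# Two visible characters, a space, and two characters outside the range.
ALPHABET = ["a", "~", " ", "\t", "\u00e9"]


def agree_on_all_strings(max_len: int = 4) -> bool:
    """Return True if both patterns accept or reject every test string alike."""
    for length in range(max_len + 1):
        for chars in product(ALPHABET, repeat=length):
            candidate = "".join(chars)
            original_ok = ORIGINAL.fullmatch(candidate) is not None
            simplified_ok = SIMPLIFIED.fullmatch(candidate) is not None
            if original_ok != simplified_ok:
                return False
    return True


print(agree_on_all_strings())
```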
Take it in parts. Here's the first part:
([!-~]|[ ])*
This means any number (*) of the characters between ! and ~ (including ! and ~; this turns out to be all of the printable ASCII characters, if you look up ! and ~ in an ASCII table) or a space.
Here's the second part:
[!-~]
This means one character between ! and ~.
Here's the last part:
([!-~]|[ ])*
This means the same thing as the first part.
So this regular expression will match any string of printable ASCII characters, including spaces, provided there is at least one non-space character in the string.
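A few sample strings make that conclusion concrete. This is only a sketch using Python's re module in place of an XML Schema validator. I am assuming re.fullmatch mirrors the schema's behaviour of matching the entire value, and matches_pattern is a name invented for the example.

```python
import re

PATTERN = re.compile(r"([!-~]|[ ])*[!-~]([!-~]|[ ])*")


def matches_pattern(value: str) -> bool:
    """True if the whole value matches the pattern from the question."""
    return PATTERN.fullmatch(value) is not None


samples = [
    "Dave",             # only visible characters: accepted
    "  hello world  ",  # spaces around visible characters: accepted
    "",                 # empty string: rejected
    "   ",              # spaces only, no visible character: rejected
    "tab\there",        # contains a tab, which is outside the range: rejected
    "caf\u00e9",        # contains a non-ASCII letter: rejected
]

for sample in samples:
    print(repr(sample), matches_pattern(sample))
```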
The answers you've gotten seem to have missed one of the fundamentals of REs: a '-' inside square brackets isn't taken to mean a literal '-' unless it's the first or last character. Instead, the '-' defines a range. The '!' is (in ASCII, ISO 8859, etc.) character code 33 -- the first "visible" printable character. Likewise, in ASCII, the '~' is code 126, the last printable character.
Therefore, the "[!-~]" matches a single printable (ASCII) character.
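If you would rather not trust an ASCII table, the range can be listed directly. This is a small Python sketch, and the variable name visible is just for illustration. It walks the 128 ASCII code points and keeps the ones that "[!-~]" accepts.

```python
import re

# Collect every ASCII character that the class [!-~] matches.
visible = [chr(code) for code in range(128) if re.fullmatch(r"[!-~]", chr(code))]

# First code point, last code point, and how many characters are in the range.
print(ord(visible[0]), ord(visible[-1]), len(visible))  # 33 126 94
print("".join(visible))
```

Note that the space (code 32) is not in the list, which is why the schema author had to add "[ ]" separately.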
For the rest, the other answers seem reasonable.
Edit: it looks like as I was writing this, some more accurate answers were posted -- my apologies if I offended anybody by implying otherwise. As I started writing this, the answers that had been posted were wrong on this point.
The regular expression consists of:
([!-~]|[ ])*
start with zero or more characters of the range from ! (0x21) to ~ (0x7E) or the space character (0x20), so basically all printable characters from 0x21 to 0x7E plus the space character
[!-~]
followed by a single printable character
([!-~]|[ ])*
followed by zero or more printable characters or the space character
So it basically says that the string must only contain printable characters or the space character and there must be at least one printable character.
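The two conditions in that summary can also be written without a regular expression, which may make the intent easier to see. This is a hypothetical Python helper (the name is_valid_without_regex is mine and does not come from the schema), assuming the character codes given above.

```python
def is_valid_without_regex(value: str) -> bool:
    """Check the two conditions the pattern encodes, using character codes."""
    # Condition 1: every character is a space (0x20) or in the range 0x21-0x7E.
    only_allowed = all(0x20 <= ord(ch) <= 0x7E for ch in value)
    # Condition 2: at least one character is in the range 0x21-0x7E (not a space).
    has_visible = any(0x21 <= ord(ch) <= 0x7E for ch in value)
    return only_allowed and has_visible


print(is_valid_without_regex("a b"))   # True
print(is_valid_without_regex("   "))   # False
print(is_valid_without_regex("a\nb"))  # False
```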