ansaurus

Question

Answer 1

+1 A:

[!-~] Matches any of the characters between "!" and "~" (the represented characters theoretically depend on the encoding in use)

[ ] Matches a space character

(x|y) Matches one of x or y

(x)* Matches any number of consecutive occurrences of x (including none).
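Each of these building blocks can be tried on its own. The following is a minimal sketch, assuming Python's `re` module (the question does not say which regex engine is in use); the helper name `matches` is made up for illustration.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# [!-~] is a range: any single character from "!" to "~".
print(matches(r"[!-~]", "A"))    # True
print(matches(r"[!-~]", " "))    # False: the space sits just below "!"

# [ ] is a class that contains only the space character.
print(matches(r"[ ]", " "))      # True

# (x|y) matches exactly one of the alternatives.
print(matches(r"(x|y)", "y"))    # True
print(matches(r"(x|y)", "xy"))   # False

# (x)* matches zero or more consecutive occurrences of x.
print(matches(r"(x)*", ""))      # True
print(matches(r"(x)*", "xxx"))   # True
```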

Romain 2010-02-09 22:07:10

It seems like it would be better written as `([!-~ ])*[!-~]([!-~ ])*`

Anon. 2010-02-09 22:08:33

There are always other ways to write regular expressions, unless they are trivial :)

Romain 2010-02-09 22:20:12

Answer 2

+1 A:

Any number of characters in the range ! to ~ or spaces, followed by one character in the range ! to ~, followed again by any number of characters in that range or spaces. So it would appear to be the same as:

([!-~ ])*[!-~]([!-~ ])*

Stephen Cross 2010-02-09 22:08:01

Or also equivalent to `([!-~]|[ ]?)+`. Note that [!-~] is a character range, not a set of three literal characters (it matches everything between ! and ~, not just !, ~ and -).

Romain 2010-02-09 22:13:10

@Romain: No, your example matches (among other incorrect things), the empty string.
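The objection is easy to reproduce. A minimal sketch, assuming Python's `re` module and whole-string matching via `re.fullmatch`; the names `ORIGINAL`, `PROPOSED` and `accepts` are made up for illustration.

```python
import re

ORIGINAL = r"([!-~]|[ ])*[!-~]([!-~]|[ ])*"
PROPOSED = r"([!-~]|[ ]?)+"   # the variant suggested in the earlier comment

def accepts(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# The empty string and a spaces-only string are rejected by the original
# pattern but accepted by the proposed one, because [ ]? may match nothing.
for text in ["", "   "]:
    print(repr(text), accepts(ORIGINAL, text), accepts(PROPOSED, text))
```

Both lines print `False True`, so the two patterns are not equivalent.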

Anon. 2010-02-09 22:16:15

Right. Never mind the example; the secondary comment is still valid, though :)

Romain 2010-02-09 22:17:14

Thanks, I corrected it.

Stephen Cross 2010-02-09 22:25:08

Answer 3

+3 A:

Take it in parts. Here's the first part:

([!-~]|[ ])*

This means any number (*) of the characters between ! and ~ (including ! and ~; if you look up ! and ~ in an ASCII table, this turns out to be all of the printable ASCII characters except the space) or a space.

Here's the second part:

[!-~]

This means one character between ! and ~

Here's the last part:

([!-~]|[ ])*

This means the same thing as the first part.

So this regular expression will match any string of printable ASCII characters, including spaces, provided there is at least one non-space character in the string.
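A few sample strings make that conclusion concrete. This is a minimal sketch, assuming Python's `re` module and assuming the pattern is meant to cover the whole string (hence `fullmatch`); the name `is_valid` and the sample strings are made up for illustration.

```python
import re

PATTERN = re.compile(r"([!-~]|[ ])*[!-~]([!-~]|[ ])*")

def is_valid(text: str) -> bool:
    """Return True if the whole of `text` matches the pattern."""
    return PATTERN.fullmatch(text) is not None

print(is_valid("hello world"))  # True: printable ASCII with a space
print(is_valid("  x  "))        # True: one non-space character is enough
print(is_valid("   "))          # False: spaces only
print(is_valid(""))             # False: empty string
print(is_valid("tab\there"))    # False: tab (0x09) is outside the range
print(is_valid("caf\u00e9"))    # False: U+00E9 is not ASCII
```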

Dominic Cooney 2010-02-09 22:10:13

I don't get it, what about the "|[ ]" part in first and last part of the regex? Does it mean nothing?

BeowulfOF 2010-02-09 22:16:19

It's an alternative: either one non-space character (the part before the |), or a space character (the part after the |).

Romain 2010-02-09 22:18:25

Why not just '[ -~]*[!-~][ -~]*' ?

dtmilano 2010-02-10 00:07:40

@dtmilano: you're right, `[ -~]*[!-~][ -~]*` works just fine (at least it does in RegexBuddy when I specify "XML Schema" mode).

Alan Moore 2010-02-10 02:58:22

In fact, `[ ]*[!-~][ -~]*` works too, and I think it's more readable as well as more efficient (not that efficiency is likely to be an issue).
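One way to gain confidence that the shorter forms accept the same strings is a brute-force comparison. A minimal sketch, assuming Python's `re` module; the dictionary keys, the small alphabet and the length limit are arbitrary choices for illustration, and agreement on short strings is a sanity check, not a proof.

```python
import re
from itertools import product

PATTERNS = {
    "original": r"([!-~]|[ ])*[!-~]([!-~]|[ ])*",
    "short": r"[ -~]*[!-~][ -~]*",
    "shorter": r"[ ]*[!-~][ -~]*",
}

# Space, both ends of the range, a letter, and one character (tab)
# that none of the patterns should accept.
ALPHABET = [" ", "!", "a", "~", "\t"]

def verdicts(text: str) -> tuple:
    """Whole-string match result of each pattern for `text`."""
    return tuple(re.fullmatch(p, text) is not None for p in PATTERNS.values())

def all_agree(max_len: int = 4) -> bool:
    """Check every string over ALPHABET up to max_len characters."""
    for n in range(max_len + 1):
        for chars in product(ALPHABET, repeat=n):
            if len(set(verdicts("".join(chars)))) != 1:
                return False
    return True

print(all_agree())  # True
```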

Alan Moore 2010-02-10 03:24:55

Answer 4

+2 A:

The answers you've gotten seem to have missed one of the fundamentals of REs: a '-' inside square brackets isn't taken to mean a literal '-' unless it's the first or last character. Instead, the '-' defines a range. The '!' is (in ASCII, ISO 8859, etc.) character code 33 -- the first "visible" printable character. Likewise, in ASCII, the '~' is code 126, the last printable character.

Therefore, the "[!-~]" matches a single visible (non-space) printable ASCII character.
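The difference between a range and a literal '-' can be seen by listing what each class matches. A minimal sketch, assuming Python's `re` module; the names `RANGE`, `LITERAL` and `chars_matched` are made up for illustration.

```python
import re

RANGE = r"[!-~]"     # "-" between two characters: the range "!" to "~"
LITERAL = r"[!~-]"   # "-" in last position: the literals "!", "~" and "-"

def chars_matched(pattern: str) -> list:
    """Collect the ASCII characters (codes 0 to 127) the class matches."""
    return [chr(c) for c in range(128) if re.fullmatch(pattern, chr(c))]

print(ord("!"), ord("~"))          # 33 126
print(len(chars_matched(RANGE)))   # 94
print(chars_matched(LITERAL))      # ['!', '-', '~']
```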

For the rest, the other answers seem reasonable.

Edit: it looks like as I was writing this, some more accurate answers were posted -- my apologies if I offended anybody by implying otherwise. As I started writing this, the answers that had been posted were wrong on this point.

Jerry Coffin 2010-02-09 22:15:00

That's what was tripping me up. I'm glad you and others explained that.

Dave 2010-02-09 22:23:45

doh, I missed that small fact myself :(

bramp 2010-02-10 01:20:20

Answer 5

+1 A:

The regular expression consists of:

([!-~]|[ ])* start with zero or more characters of the range from ! (0x21) to ~ (0x7E) or the space character (0x20), so basically all printable characters from 0x21 to 0x7E plus the space character
[!-~] followed by a single printable character from 0x21 to 0x7E (so not a space)
([!-~]|[ ])* followed by zero or more printable characters or the space character

So it basically says that the string must contain only printable characters or the space character, and that there must be at least one printable character other than the space.
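The same rule can be written without a regular expression, directly from the character codes. A minimal sketch in Python; the function name `is_valid` is made up for illustration.

```python
def is_valid(text: str) -> bool:
    """Apply the rule the regex expresses, using character codes."""
    # Every character must be a space (0x20) or in the range 0x21 to 0x7E.
    only_allowed = all(0x20 <= ord(ch) <= 0x7E for ch in text)
    # At least one character must come from 0x21 to 0x7E (not a space).
    has_non_space = any(0x21 <= ord(ch) <= 0x7E for ch in text)
    return only_allowed and has_non_space

print(is_valid("a b"))   # True
print(is_valid(" "))     # False
print(is_valid(""))      # False
print(is_valid("a\n"))   # False: newline (0x0A) is not allowed
```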

Gumbo 2010-02-09 22:15:16


What does this regex mean?
