So this is the solution I like right now:
class String
  def get_column(n)
    self =~ /\A\{(?:\w*\|){#{n}}(\w*)(?:\|\w*)*\}\Z/ && $1
  end
end
We use a regular expression to make sure that the string is of the correct format, while simultaneously grabbing the correct column.
Explanation of regex:
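- \A is the beginning of the string and \Z is the end, so this regex matches the entire string. (Strictly speaking, \Z also tolerates a single trailing newline; use \z if you want to forbid that.)
- Since curly braces have a special meaning in a regex, we escape them as \{ and \} to match the literal curly braces at the beginning and end of the string.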
- Next, we want to skip the first n columns - we don't care about them.
- Each skipped column is some number of word characters followed by a vertical bar, so we use the standard \w to match a word-like character (includes numbers and underscore, but why not) and * to match any number of them. The vertical bar has a special meaning, so we have to escape it as \|. Since we want to group this, we enclose it all inside non-capturing parens: (?:\w*\|) (the ?: makes it non-capturing).
- Now we have n of these skipped columns, so we tell the regex to match the column pattern n times using the count quantifier - just put a number in curly braces after a pattern. We use standard string interpolation, so we just put in {#{n}} to mean "match the previous pattern exactly n times".
- The first non-skipped column after that is the one we care about, so we put that in capturing parens: (\w*)
- Then we skip the rest of the columns, if any exist: (?:\|\w*)*
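To see how the pieces above fit together, here is a small sketch that builds the pattern for a given n and prints the result of the interpolation. The column_pattern helper is a hypothetical name used only for illustration; it is not part of the solution itself.

```ruby
# Hypothetical helper (illustration only): returns the same regex the
# solution uses, with n interpolated into the {n} count quantifier.
def column_pattern(n)
  /\A\{(?:\w*\|){#{n}}(\w*)(?:\|\w*)*\}\Z/
end

# Inspect the pattern text after interpolation for n = 2.
puts column_pattern(2).source

# Capture group 1 holds the column that follows the n skipped ones.
match = column_pattern(2).match("{a|b|c}")
puts match[1]
```

With n = 2 the quantifier becomes {2}, so exactly two "word characters plus bar" groups are skipped before the capture.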
Capturing the column puts it into $1, so we return that value if the regex matched. If not, we return nil, since this String has no nth column.
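A quick usage sketch of the method above, with sample strings made up for illustration. It shows the matched case, an empty column, an out-of-range column, and a string that is not in the braces-and-bars format at all.

```ruby
class String
  # Same method as above: validate the format and capture column n at once.
  def get_column(n)
    self =~ /\A\{(?:\w*\|){#{n}}(\w*)(?:\|\w*)*\}\Z/ && $1
  end
end

p "{foo|bar|baz}".get_column(0)  # first column
p "{foo|bar|baz}".get_column(2)  # last column
p "{foo||baz}".get_column(1)     # an empty column comes back as ""
p "{foo|bar|baz}".get_column(3)  # no such column, so nil
p "foo|bar|baz".get_column(0)    # missing braces, so nil
```

Note that a malformed string and an out-of-range column both produce nil here, so the caller cannot tell the two cases apart.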
In general, if you wanted to have more than just words in your columns (like "{a phrase or two|don't forget about punctuation!|maybe some longer strings that have\na newline or two?}"), then just replace every \w in the regex with [^|{}] so that each column can contain anything except a curly brace or a vertical bar.
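As a sketch of that substitution, here is the same method with every \w swapped for [^|{}]. The name get_column_any is an assumption, picked only so it does not overwrite get_column.

```ruby
class String
  # Hypothetical variant: columns may hold anything except |, { or }.
  def get_column_any(n)
    self =~ /\A\{(?:[^|{}]*\|){#{n}}([^|{}]*)(?:\|[^|{}]*)*\}\Z/ && $1
  end
end

s = "{a phrase or two|don't forget about punctuation!|maybe some longer strings that have\na newline or two?}"

p s.get_column_any(0)
p s.get_column_any(1)
p s.get_column_any(2)  # the embedded newline is matched by the negated class
```

Inside a character class the bar and the braces are literal, so they need no escaping there.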
Here's my previous solution:
class String
  def get_column(n)
    raise "not a column string" unless self =~ /\A\{\w*(?:\|\w*)*\}\Z/
    self[1 .. -2].split('|')[n]
  end
end
We use a similar regex to make sure the String contains a set of columns, and raise an error if it doesn't. Then we strip the curly braces from the front and back (using self[1 .. -2] to take the substring that starts at the second character, index 1, and ends at the next-to-last character), split the columns on the pipe character (using .split('|') to create an array of columns), and then find the nth column (using standard Array lookup with [n]).
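The slice, split, and lookup steps can be traced one at a time on a sample string (made up for illustration). The last lines show one difference from the regex version that is easy to miss: split with no limit drops trailing empty strings, so a final empty column comes back as nil unless you pass -1 as the limit.

```ruby
s = "{foo|bar|baz}"

inner = s[1 .. -2]          # drop the leading "{" and the trailing "}"
columns = inner.split('|')  # one array element per column

p inner
p columns
p columns[1]                # ordinary Array lookup
p columns[5]                # out of range, so nil

# Trailing empty columns: split('|') discards them, split('|', -1) keeps them.
p "{a|b|}"[1 .. -2].split('|')
p "{a|b|}"[1 .. -2].split('|', -1)
```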
I just figured as long as I was using the regex to verify the string, I might as well use it to capture the column.