Edit: I'm really just curious as to how I can get this regex to work. Please don't tell me there are easier ways to do it. That's obvious! :P
I'm writing a regular expression (using Python) to parse lines in a configuration file. Lines could look like this:
someoption1 = some value # some comment
# this line is only a comment
someoption2 = some value with an escaped \# hash
someoption3 = some value with a \# hash # some comment
The idea is that anything after a hash symbol is considered a comment, unless the hash is escaped with a backslash.
I'm trying to use a regex to break each line into its individual pieces: leading whitespace, left side of the assignment, right side of the assignment, and comment. For the first line in the example, the breakdown would be:
- Whitespace: ""
- Assignment left: "someoption1 ="
- Assignment right: " some value "
- Comment: "# some comment"
This is the regex I have so far:
^(\s)?(\S+\s?=)?(([^\#]*(\\\#)*)*)?(\#.*)?$
I'm terrible with regex, so feel free to tear it apart!
Using Python's re.findall(), this returns:
- 0th index: the whitespace, as it should be
- 1st index: the left side of the assignment
- 2nd index: the right side of the assignment, up to the first hash, whether escaped or not (which is incorrect)
- 5th index: the first hash, whether escaped or not, and anything after it (which is incorrect)
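To make this easy to reproduce, here is a minimal sketch that runs the pattern above against the third sample line. The names PATTERN, line and groups are just ones I picked for illustration, and I'm assuming the line is passed in without a trailing newline.

```python
import re

# The pattern from above, unchanged.
PATTERN = r'^(\s)?(\S+\s?=)?(([^\#]*(\\\#)*)*)?(\#.*)?$'

# Third sample line: an escaped hash followed by a real comment.
line = r'someoption3 = some value with a \# hash # some comment'

# The pattern is anchored at both ends, so findall() yields one tuple of groups.
groups = re.findall(PATTERN, line)[0]

print(repr(groups[1]))  # left side of the assignment
print(repr(groups[2]))  # right side: stops at the first hash, escaped or not
print(repr(groups[5]))  # comment: starts at the first hash, escaped or not
```

When I run this, the right side ends with the backslash and the comment group begins at the escaped hash, which is the incorrect split described above.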
There's probably something fundamental about regular expressions that I'm missing. If somebody can solve this I'll be forever grateful...