I need to tokenize the following tag:
{TagName attrib1="value1" attrib2="value 3"}
I would like to write a regex to do it, but the trouble is that an attribute value can contain spaces, so I can't just split on whitespace.
It can't be put more clearly than this:
http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
Please explain why you need a regexp for this.
Also, you didn't say anything about your preferred language.
Assuming Perl:
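use strict;
use warnings;

my $str = "{TagName attrib1=\"value1\" attrib2=\"value 3\"}";
# Capture the tag name, then exactly two name="value" pairs.
# (.*?) is non-greedy, so each value stops at its closing quote and may contain spaces.
if ($str =~ m/\{(\w+)\s+(\w+)="(.*?)"\s+(\w+)="(.*?)"/)
{
    print "tagname: $1\n";
    print "attrib: $2\n";
    print "value: $3\n";
    print "attrib: $4\n";
    print "value: $5\n";
}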
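The pattern above only handles a tag with exactly two attributes. If the number of attributes can vary, a rough sketch is to grab the tag name first and then walk the rest with a \G-anchored global match. The helper name tokenize_tag is made up for illustration, and the sketch assumes values are always double-quoted and never contain a double quote themselves:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the tag name followed by [name, value] pairs,
# or an empty list if the input does not look like a well-formed tag.
sub tokenize_tag {
    my ($str) = @_;
    # Tag name right after the opening brace; everything else up to the closing brace.
    my ($name, $rest) = $str =~ /^\{(\w+)(.*)\}\s*$/s or return;
    my @attrs;
    # Each attribute is name="value"; the value may contain spaces but no double quote.
    # \G continues from where the previous match ended; /c keeps that position on failure.
    while ($rest =~ /\G\s+(\w+)="([^"]*)"/gc) {
        push @attrs, [$1, $2];
    }
    # Anything other than trailing whitespace left over means the tag was malformed.
    return unless $rest =~ /\G\s*$/gc;
    return ($name, @attrs);
}

my ($tag, @attrs) = tokenize_tag('{TagName attrib1="value1" attrib2="value 3"}');
print "tagname: $tag\n";
print "attrib: $_->[0], value: $_->[1]\n" for @attrs;
```

Matching [^"]* instead of splitting on spaces is what lets "value 3" come through as a single token.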
But again, don't use regexps for this!!