I've only been working with Boost.Spirit (from Boost 1.44) for three days, trying to parse raw e-mail messages by way of the exact grammar in RFC2822. I thought I was starting to understand it and get somewhere, but then I ran into a problem:
#include <iostream>
#include <boost/spirit/include/qi.hpp>
namespace qi = boost::spirit::qi;
using qi::omit;
using qi::repeat;
using std::cout;
using std::endl;
typedef qi::rule<std::string::const_iterator, std::string()> strrule_t;
void test(const std::string input, strrule_t rule) {
std::string target;
std::string::const_iterator i = input.begin(), ie = input.end();
if (qi::parse(i, ie, rule, target)) {
cout << "Success: '" << target << "'" << endl;
} else {
cout << "Failed to match." << endl;
}
}
int main() {
strrule_t obsolete_year = omit[-qi::char_(" \t")] >> repeat(2)[qi::digit] >>
omit[-qi::char_(" \t")];
strrule_t correct_year = repeat(4)[qi::digit];
test("1776", correct_year | repeat(2)[qi::digit]); // 1: Works, reports 1776.
test("76", obsolete_year); // 2: Works, reports 76.
test("76", obsolete_year | correct_year); // 3: Works, reports 76.
test(" 76", correct_year | obsolete_year); // 4: Works, reports 76.
test("76", correct_year | obsolete_year); // 5: Fails.
test("76", correct_year | repeat(2)[qi::digit]); // 6: Also fails.
}
If test #3 works, then why does test #5 -- the exact same test with the two alternatives reversed -- fail?
By the same token, if you'll pardon the expression: if test #4 works, and the space at the beginning is marked as optional, then why does test #5 (the exact same test with the exact same input, save that there's no leading space in the input) fail?
And finally, if this is a bug in Boost.Spirit (as I suspect it must be), how can I work around it?