For example, I have to find a date in the format mentioned in the title (though the order of the % tags can differ) in a string such as "The date is 2009-August-25."
How can I make the program interpret the tags, and what construct is best for storing them along with information about how to handle each piece of the date string?
Need to parse a string, having a mask (something like this "%yr-%mh-%dy"), so i get the int values.
I'd transform the tagged string into a regular expression with a capture group for each of the 3 fields and search for it. The complexity of the regular expression will depend on what you want to accept for %yr. You can also use a less strict expression and then check for valid values afterwards. Depending on the context, this can lead to better error messages ("Invalid month: Augsut" instead of "date not found") or to false positives.
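The approach above could be sketched roughly as follows with std::regex from the standard library (an assumption; the answer does not name a regex library). The names CompiledMask and compile_mask are hypothetical, and so are the fragments chosen for each tag: a 4-digit year, a purely alphabetic month name, and a 1 or 2 digit day. The function also records the order in which the tags appear, so that each capture group can be mapped back to its field.

```cpp
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: the compiled regex plus the order of the fields
// in the mask ('y', 'm' or 'd'), one entry per capture group.
struct CompiledMask {
    std::regex re;
    std::vector<char> order;
};

// Translate a mask such as "%yr-%mh-%dy" into a regex with one capture
// group per tag. The fragments are deliberately loose (assumption), so
// values still need to be validated after matching.
CompiledMask compile_mask(const std::string& mask) {
    static const std::map<std::string, std::pair<char, std::string>> tags = {
        {"%yr", {'y', "([0-9]{4})"}},
        {"%mh", {'m', "([A-Za-z]+)"}},
        {"%dy", {'d', "([0-9]{1,2})"}},
    };
    // Characters that must be escaped to match literally in ECMAScript syntax.
    static const std::string special = R"re(\^$.|?*+()[]{})re";

    std::string pattern;
    std::vector<char> order;
    for (std::size_t i = 0; i < mask.size();) {
        if (mask[i] == '%') {
            if (mask.compare(i, 2, "%%") == 0) {  // "%%" is an escaped percent sign
                pattern += '%';
                i += 2;
                continue;
            }
            const auto it = tags.find(mask.substr(i, 3));
            if (it != tags.end()) {  // known tag: emit its capture group
                order.push_back(it->second.first);
                pattern += it->second.second;
                i += 3;
                continue;
            }
        }
        // Anything else is a literal delimiter character.
        if (special.find(mask[i]) != std::string::npos) pattern += '\\';
        pattern += mask[i];
        ++i;
    }
    return {std::regex(pattern), order};
}
```

After a successful std::regex_search with the returned regex, capture group i + 1 holds the text of the field named by order[i], whatever order the tags had in the mask.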
First, look into the boost::date_time library. It has an I/O system which may be what you want, but as far as I can see it lacks searching.
To do custom date searching you need boost::xpressive. It contains everything you will need. Let's look at my hastily written example. First you should parse your custom pattern, which is easy with Xpressive. Start with the headers you need:
#include <string>
#include <iostream>
#include <map>
#include <stdexcept> // std::runtime_error
#include <boost/xpressive/xpressive_static.hpp>
#include <boost/xpressive/regex_actions.hpp>
//make example shorter but less clear
using namespace boost::xpressive;
Second, define a map of your special tags:
std::map<std::string, int > number_map;
number_map["%yr"] = 0;
number_map["%mh"] = 1;
number_map["%dy"] = 2;
number_map["%%"] = 3; // escape a %
The next step is to create a regex which parses our pattern with tags. It stores the value from the map in the variable tag_id when it finds a tag, and stores -1 otherwise:
int tag_id;
sregex rx=((a1=number_map)|(s1=+~as_xpr('%')))[ref(tag_id)=(a1|-1)];
For more information and a description, look here and here. Now let's parse a pattern:
std::string pattern("%yr-%mh-%dy"); // this will be parsed
sregex_token_iterator begin( pattern.begin(), pattern.end(), rx ), end;
if(begin == end) throw std::runtime_error("The pattern is empty!");
The sregex_token_iterator will iterate over our tokens, and each time it will set the tag_id variable. All we have to do is build a regex from these tokens. We construct it from the parts of a static regex that correspond to each tag, defined in an array:
sregex regex_group[] = {
    range('1','9') >> repeat<3,3>( _d ),          // 4-digit year
    as_xpr( "January" ) | "February" | "August",  // not all months listed, too lazy
    repeat<2,2>( range('0','9') )[                // two-digit day
        check(as<int>(_) >= 1 && as<int>(_) <= 31) ], // only between 1 and 31
    as_xpr( '%' )                                 // match an escaped %
};
Finally, let's start building our special regex. The first match constructs its first part. If a tag was matched, tag_id is non-negative and we choose the regex from the array; otherwise the match is probably a delimiter, and we construct a regex which matches it literally:
sregex custom_regex = (tag_id>=0) ? regex_group[tag_id] : as_xpr(begin->str());
Next, we iterate from begin to end and append the regex for each following token:
while(++begin != end)
{
    if(tag_id>=0)
    {
        sregex nextregex = custom_regex >> regex_group[tag_id];
        custom_regex = nextregex;
    }
    else
    {
        sregex nextregex = custom_regex >> as_xpr(begin->str());
        custom_regex = nextregex;
    }
}
Now our regex is ready, so let's find some dates :-]
std::string input = "The date is 2009-August-25.";
smatch mydate;
if( regex_search( input, mydate, custom_regex ) )
    std::cout << "Found " << mydate.str() << "." << std::endl;
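The example above only prints the matched text, while the question asks for int values. Once the year, month and day pieces have been separated, converting them could look roughly like the sketch below. It uses only the standard library; the names Date, month_to_int and to_date are hypothetical, and the assumption is that months are written as full English names, as in the sample input.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical helper: map a full English month name to 1..12.
// Throws with a specific message so the caller can report "Invalid month: ...".
int month_to_int(const std::string& name) {
    static const std::map<std::string, int> months = {
        {"January", 1},   {"February", 2}, {"March", 3},     {"April", 4},
        {"May", 5},       {"June", 6},     {"July", 7},      {"August", 8},
        {"September", 9}, {"October", 10}, {"November", 11}, {"December", 12},
    };
    const auto it = months.find(name);
    if (it == months.end())
        throw std::invalid_argument("Invalid month: " + name);
    return it->second;
}

// Hypothetical result type holding the three int values.
struct Date {
    int year;
    int month;
    int day;
};

// Convert the three matched pieces to ints. The day check is only a coarse
// range test (assumption); it does not account for month lengths.
Date to_date(const std::string& year, const std::string& month,
             const std::string& day) {
    Date d{std::stoi(year), month_to_int(month), std::stoi(day)};
    if (d.day < 1 || d.day > 31)
        throw std::out_of_range("Invalid day: " + day);
    return d;
}
```

For the sample input, to_date("2009", "August", "25") would yield year 2009, month 8 and day 25, while a misspelled month such as "Augsut" is reported by name instead of as a generic "date not found".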
The xpressive library is very powerful and fast. It is also a beautiful use of patterns.
If you like this example, let me know in a comment or with points ;-)