For example, I have to find a time in the format mentioned in the title (the order of the %-tags can differ) in a string such as "The date is 2009-August-25." How can I make the program interpret the tags, and what construct is best for storing them along with information about how to handle each piece of the date string?

+1  A: 

I'd transform the tagged string into a regular expression with captures for the three fields and search for it. The complexity of the regular expression will depend on what you want to accept for %yr. You can also use a less strict expression and then check for valid values; this can lead to better error messages ("Invalid month: Augsut" instead of "date not found") or to false positives, depending on the context.
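A minimal sketch of that idea, using std::regex from the standard library rather than a Boost library. The names compile_pattern, find_date and validate are hypothetical, and the sub-expressions chosen for each tag (four digits, any word, two digits) are assumptions made for illustration. The month is matched leniently on purpose and checked afterwards, so a misspelling produces a specific message instead of no match at all.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// A tagged pattern translated into a regex, plus the order of its tags.
struct CompiledPattern {
    std::regex re;                  // one capture group per tag
    std::vector<std::string> tags;  // tags[i] names capture group i + 1
};

struct DateFields {
    std::string year, month, day;
};

// Hypothetical helper: turns e.g. "%yr-%mh-%dy" into a regex with captures.
CompiledPattern compile_pattern(const std::string& pattern) {
    const std::string special = "\\^$.|?*+()[]{}";  // regex metacharacters
    std::string expr;
    std::vector<std::string> tags;
    std::size_t i = 0;
    while (i < pattern.size()) {
        const std::string tag = pattern.substr(i, 3);
        if (tag == "%yr" || tag == "%mh" || tag == "%dy") {
            // Assumed field shapes; the month accepts any word and is
            // checked later in validate().
            expr += (tag == "%mh") ? "([A-Za-z]+)"
                  : (tag == "%yr") ? "([0-9]{4})" : "([0-9]{2})";
            tags.push_back(tag);
            i += 3;
        } else {
            // Everything else is a literal delimiter; escape it if needed.
            if (special.find(pattern[i]) != std::string::npos) expr += '\\';
            expr += pattern[i];
            ++i;
        }
    }
    return CompiledPattern{std::regex(expr), tags};
}

// Searches text and copies each captured field into the matching member.
bool find_date(const std::string& text, const CompiledPattern& cp, DateFields& out) {
    std::smatch m;
    if (!std::regex_search(text, m, cp.re)) return false;
    assert(m.size() == cp.tags.size() + 1);  // group 0 is the whole match
    for (std::size_t g = 0; g < cp.tags.size(); ++g) {
        const std::string value = m[g + 1].str();
        if (cp.tags[g] == "%yr") out.year = value;
        else if (cp.tags[g] == "%mh") out.month = value;
        else out.day = value;
    }
    return true;
}

// Returns an empty string if the fields look valid, else an error message.
std::string validate(const DateFields& f) {
    static const char* const months[] = {
        "January", "February", "March", "April", "May", "June", "July",
        "August", "September", "October", "November", "December"};
    bool known = false;
    for (const char* name : months) known = known || (f.month == name);
    if (!f.month.empty() && !known) return "Invalid month: " + f.month;
    if (!f.day.empty()) {
        const int day = std::stoi(f.day);
        if (day < 1 || day > 31) return "Invalid day: " + f.day;
    }
    return "";
}
```

Calling compile_pattern("%yr-%mh-%dy") once and then find_date on each input keeps the translation cost out of the search loop; validate is a separate step so the caller can decide whether a lenient match counts as an error or as a false positive.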

AProgrammer
+1  A: 

First, look into the boost::date_time library. It has an I/O system which may be what you want, but as far as I can see it lacks searching.

To do custom date searching you need boost::xpressive. It contains everything you will need. Let's look at my hastily written example. First you should parse your custom pattern, which is easy with Xpressive. Start with the headers you need:

#include <string>
#include <iostream>
#include <map>
#include <stdexcept> // std::runtime_error, used below
#include <boost/xpressive/xpressive_static.hpp>
#include <boost/xpressive/regex_actions.hpp>

// makes the example shorter but less clear
using namespace boost::xpressive;

Second, define a map of your special tags:

std::map<std::string, int > number_map;
number_map["%yr"] = 0;
number_map["%mh"] = 1;
number_map["%dy"] = 2;
number_map["%%"] = 3;  // escape a %

The next step is to create a regex which will parse our pattern with tags. It saves the value from the map into the variable tag_id when it finds a tag, or saves -1 otherwise:

int tag_id;
sregex rx=((a1=number_map)|(s1=+~as_xpr('%')))[ref(tag_id)=(a1|-1)];

For more information and a description, look here and here. Now let's parse a pattern:

  std::string pattern("%yr-%mh-%dy"); // this will be parsed

  sregex_token_iterator begin( pattern.begin(), pattern.end(), rx ), end;
  if(begin == end) throw std::runtime_error("The pattern is empty!");
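To make the token stream concrete, here is a Boost-free sketch of the split that rx performs on the pattern. It is an illustration, not the Xpressive implementation: the Token struct and the tokenize function are hypothetical names, and the sketch assumes that a stray '%' which starts no known tag is simply skipped.

```cpp
#include <map>
#include <string>
#include <vector>

struct Token {
    int tag_id;        // value from the tag map, or -1 for literal text
    std::string text;  // the matched tag or the run of literal characters
};

// Splits a pattern into known tags and the literal runs between them.
std::vector<Token> tokenize(const std::string& pattern,
                            const std::map<std::string, int>& tags) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < pattern.size()) {
        if (pattern[i] == '%') {
            // Tags in this example are two or three characters long;
            // try the longer form first.
            bool matched = false;
            for (std::size_t len = 3; len >= 2 && !matched; --len) {
                const auto it = tags.find(pattern.substr(i, len));
                if (it != tags.end()) {
                    tokens.push_back(Token{it->second, it->first});
                    i += it->first.size();
                    matched = true;
                }
            }
            if (!matched) ++i;  // assumption: an unknown '%' is skipped
        } else {
            // A run of characters up to the next '%' is one literal token.
            std::size_t next = pattern.find('%', i);
            if (next == std::string::npos) next = pattern.size();
            tokens.push_back(Token{-1, pattern.substr(i, next - i)});
            i = next;
        }
    }
    return tokens;
}
```

With the tag map defined earlier, "%yr-%mh-%dy" splits into five tokens: the three tags with ids 0, 1 and 2, separated by two "-" literals with id -1.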

The sregex_token_iterator will iterate over our tokens, and each time it will set the tag_id variable. All we have to do is build a regex from these tokens. We will construct it from the parts that correspond to each tag, taken from static regexes defined in an array:

sregex regex_group[] = {
    range('1','9') >> repeat<3,3>( _d ), // four-digit year
    as_xpr( "January" ) | "February" | "August", // only a few months listed, for brevity
    repeat<2,2>( range('0','9') )[    // two-digit day
    check(as<int>(_) >= 1 && as<int>(_) <= 31) ], // only between 1 and 31
    as_xpr( '%' )  // matches an escaped %
};

Finally, let's start building our special regex. The first match constructs the first part of it. If a tag is matched and tag_id is non-negative, we choose the regex from the array; otherwise the match is probably a delimiter, and we construct a regex which matches it:

sregex custom_regex = (tag_id>=0) ? regex_group[tag_id] : as_xpr(begin->str());

Next we iterate from begin to end and append the regex for each following token:

while(++begin != end)
{
    if(tag_id>=0)
    {
        sregex nextregex = custom_regex >> regex_group[tag_id];
        custom_regex = nextregex;
    }
    else
    {
        sregex nextregex = custom_regex >> as_xpr(begin->str());
        custom_regex = nextregex;
    }
}

Now our regex is ready, so let's find some dates :-]

std::string input = "The date is 2009-August-25.";

smatch mydate;
if( regex_search( input, mydate, custom_regex ) )
    std::cout << "Found " << mydate.str() << "." << std::endl;
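The search above reports only the whole match. For the part of the question about storing the tags together with how to act on each piece, one possible construct is a table that maps each tag to a regex fragment and a handler. The sketch below uses std::regex and std::function instead of Xpressive to stay dependency-free; Date, TagSpec, make_tag_table and parse_date are hypothetical names, and the fragments are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <regex>
#include <string>
#include <vector>

struct Date {
    int year = 0, month = 0, day = 0;
};

// Everything the program needs to know about one tag.
struct TagSpec {
    std::string fragment;                                  // regex text for the field
    std::function<void(Date&, const std::string&)> store;  // what to do with the capture
};

// Maps an English month name to 1..12, or 0 if it is not recognised.
int month_number(const std::string& name) {
    static const char* const names[] = {
        "January", "February", "March", "April", "May", "June", "July",
        "August", "September", "October", "November", "December"};
    for (int i = 0; i < 12; ++i)
        if (name == names[i]) return i + 1;
    return 0;
}

std::map<std::string, TagSpec> make_tag_table() {
    std::map<std::string, TagSpec> t;
    t["%yr"] = TagSpec{"[1-9][0-9]{3}",
        [](Date& d, const std::string& s) { d.year = std::stoi(s); }};
    t["%mh"] = TagSpec{"[A-Za-z]+",
        [](Date& d, const std::string& s) { d.month = month_number(s); }};
    t["%dy"] = TagSpec{"[0-9]{2}",
        [](Date& d, const std::string& s) { d.day = std::stoi(s); }};
    return t;
}

// Builds a regex from the pattern, searches text, and runs each tag's handler.
bool parse_date(const std::string& pattern, const std::string& text, Date& out) {
    const std::map<std::string, TagSpec> table = make_tag_table();
    const std::string special = "\\^$.|?*+()[]{}";  // regex metacharacters
    std::string expr;
    std::vector<const TagSpec*> order;  // order[i] handles capture group i + 1
    std::size_t i = 0;
    while (i < pattern.size()) {
        const auto it = table.find(pattern.substr(i, 3));
        if (it != table.end()) {
            expr += "(" + it->second.fragment + ")";
            order.push_back(&it->second);
            i += 3;
        } else if (pattern.compare(i, 2, "%%") == 0) {
            expr += '%';  // "%%" stands for a literal percent sign
            i += 2;
        } else {
            if (special.find(pattern[i]) != std::string::npos) expr += '\\';
            expr += pattern[i++];
        }
    }
    const std::regex re(expr);
    std::smatch m;
    if (!std::regex_search(text, m, re)) return false;
    assert(m.size() == order.size() + 1);  // group 0 is the whole match
    for (std::size_t g = 0; g < order.size(); ++g)
        order[g]->store(out, m[g + 1].str());
    return true;
}
```

Adding a new tag then means adding one table entry; neither the pattern translation nor the search loop needs to change. In this sketch an unrecognised month name is stored as 0, which the caller would have to check.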

The xpressive library is very powerful and fast. It is also a beautiful use of patterns.

If you like this example, let me know in a comment or with points ;-)

lionbest