We've become fairly adept at generating various regular expressions to match input strings, but we've been asked to try to validate these strings iteratively. Is there an easy way to iteratively match the input string against a regular expression?

Take, for instance, the following regular expression:

[EW]\d{1,3}\.\d

When the user enters "E123.4", the regular expression is met. How do I validate the user's input while they type it? Can I partially match the string "E1" against the regular expression?

Is there some way to say that the input string only partially matched the input? Or is there a way to generate sub-expressions out of the master expression automatically based on string length?

I'm trying to create a generic function that can take any regular expression and throw an exception as soon as the user enters something that cannot meet the expression. Our expressions are rather simple in the grand scheme of things, and we are certainly not trying to parse HTML :)

Thanks in advance. David

A: 

You could do it only by making every part of the regex optional, and repeating yourself:

^([EW]|[EW]\d{1,3}|[EW]\d{1,3}\.|[EW]\d{1,3}\.\d)$

This might work for simple expressions, but for complex ones this is hardly feasible.
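As a rough sketch of how the hand-expanded pattern above might be used, here it is with std::regex rather than any particular library from the question. The function name IsViablePrefix is made up for illustration. Since std::regex_match already requires the whole input to match, the anchors are redundant here and are kept only to mirror the pattern as written.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original answer): returns true when
// `input` is either a complete value or a prefix that could still grow
// into one, according to the hand-expanded alternation.
bool IsViablePrefix(const std::string& input)
{
    // One alternative per "stopping point" in [EW]\d{1,3}\.\d
    static const std::regex prefixes(
        R"(^([EW]|[EW]\d{1,3}|[EW]\d{1,3}\.|[EW]\d{1,3}\.\d)$)");

    // regex_match succeeds only if the entire input matches
    return std::regex_match(input, prefixes);
}
```

Note that this accepts complete and incomplete input alike, so a separate check against the original expression is still needed to know when the user is done. The empty string is rejected, so the check should only run once something has been typed.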

Tomalak
Note that the anchors (`^` and `$`) are required or this won't work reliably.
Tomalak
A: 

Hard to say... If the user types an "E", that matches the beginning of the expression but not the rest. Of course, you don't know if they will continue to type "123.4" or if they will just hit "Enter" (I assume you use "Enter" to indicate the end of input) right away. You could split the expression into groups and test how many of the 3 groups match so far, such as:

([EW])(\d{1,3})(\.\d)

After the first character, try to match the first group. After the next few inputs, match the first AND second group, and when they enter the '.' and last digit you have to find a match for all 3 groups.
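One possible way to sketch this idea, assuming the expression has been split by hand into its segments, is to grow the pattern one segment at a time and count how many leading segments the input satisfies. The helper name CountLeadingSegments and the segment list are assumptions for illustration, and std::regex is used in place of whatever library is actually in play.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns how many leading segments of the pattern
// are found, in order, at the start of `input`.
std::size_t CountLeadingSegments(const std::string& input,
                                 const std::vector<std::string>& segments)
{
    std::string pattern;
    std::size_t matched = 0;
    for (const std::string& segment : segments)
    {
        pattern += segment;  // grow the pattern one segment at a time
        const std::regex re(pattern);

        // match_continuous forces the match to start at the first character
        if (!std::regex_search(input, re,
                               std::regex_constants::match_continuous))
            break;
        ++matched;
    }
    return matched;
}
```

With the segments "[EW]", "\\d{1,3}" and "\\.\\d", the input "E1" satisfies two segments and "E123.4" satisfies all three. The limitation is that "EX3" reports one segment, the same as "E", so on its own this cannot tell impossible input from incomplete input.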

FrustratedWithFormsDesigner
A: 

You could use partial matches if your regex lib supports it (as does Boost.Regex).

Adapting the is_possible_card_number example on this page to the example in your question:

#include <boost/regex.hpp>

#include <cassert>
#include <stdexcept>
#include <string>

// Return false for partial match, true for full match, or throw for
// impossible match
bool
CheckPartialMatch(const std::string& Input, const boost::regex& Regex)
{
    boost::match_results<std::string::const_iterator> what;
    if(!boost::regex_match(Input, what, Regex, boost::match_default | boost::match_partial))
    {
        // the input so far could not possibly be valid so reject it:
        throw std::runtime_error(
            "Invalid data entered - this could not possibly be a match");
    }

    // OK so far so good, but have we finished?
    if(what[0].matched)
    {
        // excellent, we have a result:
        return true;
    }

    // what we have so far is only a partial match...
    return false;
}




int main() 
{
    const boost::regex r("[EW]\\d{1,3}\\.\\d");

    // The input is incomplete, so we expect a "false" result
    assert(!CheckPartialMatch("E1", r));

    // The input completely satisfies the expression, so expect a "true" result
    assert(CheckPartialMatch("E123.4", r));

    try{
        // Input can't match the expression, so expect an exception.
        CheckPartialMatch("EX3", r);
        assert(false);
    }
    catch(const std::runtime_error&){
    }

    return 0;
}
Éric Malenfant
I'm using Boost 1.35 and we looked into partial matches, but it wouldn't tell me what segment of the regex it failed on. All the match segments that it returned had a false in the isMatched field.
David French
I'm not sure I understand your comment. By "segment", do you refer to a "marked subexpression" (http://www.boost.org/doc/libs/1_41_0/libs/regex/doc/html/boost_regex/captures.html)?
Éric Malenfant
To clarify, I updated my answer with an example
Éric Malenfant