I have read an entire file into a string from a memory-mapped file using the Win API:

CreateFile( "WarandPeace.txt", GENERIC_READ, FILE_SHARE_READ, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0 )

etc...

Each line is terminated with a CRLF. I need to find something on a line, like "Spam" in the line "I love Spam and Eggs", and return the entire line (without the CRLF) in a string, or a pointer to its location in the string. The original string cannot be altered.

EDITED:

Something like this:

// Returns the nField-th field (1-based) of sIn, where fields are separated by sDelim.
// Returns "" if an input is empty or the requested field does not exist.
string ParseStr( const string& sIn, const string& sDelim, int nField )
{
    const size_t LenStr   = sIn.length();
    const size_t LenDelim = sDelim.length();
    if( LenStr < 1 || LenDelim < 1 ) return ""; // empty string or delimiter
    if( nField < 1 ) return "";

    size_t sPos = 0; // start of the current field
    int count = 0;   // number of fields seen so far

    for( size_t ePos = 0; ePos < LenStr; ePos++ ) // iterate through the string
    {
        // Does the delimiter start at ePos? compare() never reads past the end of sIn.
        const bool match = ( sIn.compare( ePos, LenDelim, sDelim ) == 0 );
        const bool last  = ( ePos == LenStr-1 );

        if( match || last ) // end of a field
        {
            // A final field with no trailing delimiter runs to the end of the string
            const size_t fieldEnd = match ? ePos : LenStr;
            count++;
            if( count == nField ) return sIn.substr( sPos, fieldEnd-sPos );
            ePos = ePos+LenDelim-1; // jump over the delimiter
            sPos = ePos+1;          // next field begins after the delimiter
        }
    }

    return "";
}

If you like it, vote it up. If not, let's see what you got.

+2  A: 

If you are trying to match a more complex pattern, you can always fall back on Boost's regex library.

See: http://www.boost.org/doc/libs/1%5F41%5F0/libs/regex/doc/html/index.html

#include <fstream>
#include <iostream>
#include <string>
#include <boost/regex.hpp>

using namespace std;

int main( ) 
{
   std::string sre("Spam");
   boost::regex re;

   try
   {
      // Compile the regular expression once, case-insensitively
      re.assign(sre, boost::regex_constants::icase);
   }
   catch (boost::regex_error& e)
   {
      cout << sre << " is not a valid regular expression: \""
           << e.what() << "\"" << endl;
      return 1;
   }

   ifstream in("main.cpp");
   if (!in.is_open()) return 1;

   string line;
   while (getline(in,line))
   {
      // regex_search finds the pattern anywhere in the line;
      // regex_match would require the whole line to match
      if (boost::regex_search(line, re))
      {
         cout << re << " matches " << line << endl;
      }
   }
}
chollida
How efficient is that library?
Mike Trader
sre is never initialized; and pattern is never used.
ScottJ
@ScottJ, thanks it was pseudo code, I've updated it.
chollida
@Mike Trader: The boost regex engine is pretty darn fast, see http://www.boost.org/doc/libs/1_40_0/libs/regex/doc/gcc-performance.html
chollida
Do you think it is faster than the code above?
Mike Trader
@Mike Trader, accidentally replied to the original post. I'm not sure if it would be faster, but it's certainly more flexible.
chollida
A: 

system("grep ....");

pm100
Doesn't that just return a count? Perhaps you could expand...
Mike Trader
It's in Windows (Win API). Definitely no "grep"...
billyswong
A: 

Do you really have to do it in C++? Perhaps you could use a language which is more appropriate for text processing, like Perl, and apply a regular expression.

Anyway, if doing it in C++, a loop over Prev_delim_position = sIn.find(sDelim, Prev_delim_position) looks like a fine way to do it.
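
A minimal sketch of that loop, assuming the whole file is already in a std::string and every line ends in "\r\n". The name FindLine and its parameters are hypothetical, not from the question. It steps from delimiter to delimiter with find(), returns the first line that contains the search text (without the CRLF), and never modifies the input.

```cpp
#include <string>

// Hypothetical helper: returns the first line of sIn that contains sNeedle,
// without its trailing delimiter, or "" if no line contains it.
// Assumes sNeedle itself does not contain the delimiter.
std::string FindLine( const std::string& sIn, const std::string& sNeedle,
                      const std::string& sDelim = "\r\n" )
{
    if( sNeedle.empty() || sDelim.empty() ) return "";

    std::string::size_type lineStart = 0;
    while( lineStart <= sIn.size() )
    {
        // End of the current line: next delimiter, or end of the string
        std::string::size_type lineEnd = sIn.find( sDelim, lineStart );
        if( lineEnd == std::string::npos ) lineEnd = sIn.size();

        // Next occurrence of the needle at or after the start of this line
        const std::string::size_type hit = sIn.find( sNeedle, lineStart );
        if( hit == std::string::npos ) return ""; // no occurrence left anywhere

        // Accept the hit only if it lies entirely inside this line
        if( hit + sNeedle.size() <= lineEnd )
            return sIn.substr( lineStart, lineEnd - lineStart );

        lineStart = lineEnd + sDelim.size(); // move on to the next line
    }
    return "";
}
```

For example, calling FindLine on a buffer that contains the line "I love Spam and Eggs" with the needle "Spam" should give back that line without the CRLF.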

Igor Oks
Yes, C++ is the language I must use for this. The concept behind your method is what I had in mind, but using pointers instead of .find().
Mike Trader
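
For the pointer-based variant mentioned in that comment, here is a rough sketch, assuming the mapped view is available as a read-only char range and the search text contains no CR or LF. FindLineBounds is a hypothetical name. It returns pointers into the original buffer rather than a copy, so nothing is altered and nothing is allocated.

```cpp
#include <algorithm>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical helper: scans the read-only range [begin, end) for needle and
// returns the [lineBegin, lineEnd) bounds of the line holding the first hit,
// excluding the CRLF. Returns {end, end} if the needle is not found.
std::pair<const char*, const char*>
FindLineBounds( const char* begin, const char* end, const char* needle )
{
    const char* needleEnd = needle + std::strlen( needle );
    if( needle == needleEnd ) return { end, end }; // empty needle

    const char* hit = std::search( begin, end, needle, needleEnd );
    if( hit == end ) return { end, end };          // not found

    // Walk back to just after the previous '\n', or to the buffer start
    const char* lineBegin = hit;
    while( lineBegin != begin && lineBegin[-1] != '\n' ) --lineBegin;

    // Walk forward to the next '\n', or to the buffer end
    const char* lineEnd = std::find( hit, end, '\n' );

    // Drop the '\r' that precedes the '\n' in a CRLF pair
    if( lineEnd != lineBegin && lineEnd[-1] == '\r' ) --lineEnd;

    return { lineBegin, lineEnd };
}
```

If a std::string copy of the line is needed after all, it can be built from the two pointers with std::string( bounds.first, bounds.second ).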