I apologize if you do not know Python, but the following snippet should be readable to anyone. The only trick to watch out for: indexing a list with [-1] gives you the last element if there is one, and raises an exception otherwise.

>>> fileName = 'TheFileName.Something.xMl'
>>> fileNameList = fileName.split('.')
>>> assert(len(fileNameList) > 1) # Must have at least one period in it
>>> assert(fileNameList[-1].lower() == 'xml')
>>> fileNameList[-1] = 'bak'
>>> fileName = '.'.join(fileNameList)
>>> print(fileName)
TheFileName.Something.bak

I need to convert this logic into a C++ function (C++ is the language I am actually using, but so far suck at) with the following signature: void PopulateBackupFileNameOrDie(CAtlString& strBackupFileName, CAtlString& strXmlFileName);. Here strXmlFileName is the "input" and strBackupFileName is the "output" (should I reverse the order of the two?). The tricky part is that (correct me if I am wrong) I am working with a Unicode string, so looking for the characters `.`, `x`, `m`, `l`, `X`, `M`, `L` is not as straightforward. The latest Python (3.x) does not have these issues because '.' and "." are both Unicode strings (not a "char" type) of length 1, and both contain just a dot.

Notice that the return type is void - do not worry much about it. I do not want to bore you with details of how we communicate an error back to the user. In my Python example I just used an assert. You can do something like that or just include a comment such as // ERROR: [REASON].

Please ask if something is not clear. Suggestions to use std::string, etc. instead of CAtlString for the function parameters are not what I am looking for. You may convert them inside the function if you have to, but I would prefer not to mix different string types in one function. I am compiling this C++ on Windows, using VS2010. This implies that I WILL NOT install Boost, Qt's QString, or other libraries that are not available out of the box. Stealing a Boost or other header to enable some magic is also not the right solution.

Thanks.

+3  A: 

I didn't split the string as your code does because that's a bit more work in C++ for really no gain (it's slower, and for this task you really don't need to do it).

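#include <cassert>
#include <cctype>
#include <iostream>
#include <string>

std::string filename = "TheFileName.Something.xMl";
std::size_t pos = filename.rfind('.');
// there must be a period, followed by exactly three characters (".xml" is 4 long)
assert(pos != std::string::npos && pos + 4 == filename.length());
// lower-case the extension in place; the cast avoids passing a negative char to tolower
for (std::size_t i = pos + 1; i < filename.length(); ++i)
    filename[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(filename[i])));
assert(filename.substr(pos + 1) == "xml");
filename = filename.substr(0, pos + 1) + "bak";
std::cout << filename << std::endl;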
SoapBox
Good, you obviously do not have to think in Python while coding in C++. Is `string` Unicode? How do I convert a `CAtlString` into a `string`? Is it an `std::string`? Also, where are you converting `"XmL"` into lower case?
Hamish Grubijan
Ah sorry, I'm not familiar with ATL, so I'm not sure how to convert to an std::string. std::string is not Unicode, but std::wstring is. Using wstring is a little tricky, though, because you can't directly compare it to narrow string literals (e.g. "xml"). You'll need to use some other code to do that....
SoapBox
@SoapBox: std::string could be Unicode; it can hold characters in any 8-bit encoding, including UTF-8.
JWWalker
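For what it's worth, the std::wstring route mentioned in the comments above can be sketched as follows. This is only an illustration under stated assumptions, not SoapBox's code: the names TryMakeBackupFileName and BackupFileNameOrDie are made up for this example, std::wstring stands in for a wide-character string, and the comparison uses a wide literal (L"xml") instead of a narrow one. It sticks to an index loop so that it should also build on older compilers such as VS2010.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical helper (not from the thread): returns true and fills
// backupName when xmlName ends in ".xml" (in any letter case).
bool TryMakeBackupFileName(const std::wstring& xmlName, std::wstring& backupName)
{
    const std::size_t dot = xmlName.rfind(L'.');
    if (dot == std::wstring::npos)
        return false;                       // ERROR: no period in the name

    // Lower-case a copy of the extension so the input is left untouched.
    std::wstring ext = xmlName.substr(dot + 1);
    for (std::size_t i = 0; i < ext.size(); ++i)
        ext[i] = static_cast<wchar_t>(std::towlower(ext[i]));

    if (ext != L"xml")                      // wide literal, so the types match
        return false;                       // ERROR: extension is not xml

    backupName = xmlName.substr(0, dot + 1) + L"bak";
    return true;
}

// Mirrors the asserts in the Python snippet: dies (in debug builds) on bad input.
std::wstring BackupFileNameOrDie(const std::wstring& xmlName)
{
    std::wstring backupName;
    const bool ok = TryMakeBackupFileName(xmlName, backupName);
    assert(ok && "expected a file name ending in .xml");
    (void)ok;                               // silence unused warning under NDEBUG
    return backupName;
}
```

Calling BackupFileNameOrDie(L"TheFileName.Something.xMl") should give L"TheFileName.Something.bak". Splitting the work into a bool-returning helper and an asserting wrapper is just one way to keep the error reporting separate from the string handling.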
+6  A: 

If you're using ATL why not just use CAtlString's methods?

CAtlString filename = _T("TheFileName.Something.xMl");

//search for '.' from the end
int dotIdx = filename.ReverseFind( _T('.') );

if( dotIdx != -1 ) {
  //extract the file extension
  CAtlString ext = filename.Right( filename.GetLength() - dotIdx );

  if( ext.CompareNoCase( _T(".xml" ) ) == 0 ) {
    filename.Delete( dotIdx, ext.GetLength() ); //remove extension
    filename += _T(".bak");
  }
}
Praetorian
Good stuff, if it works, then I like it. Let me test it. By the way: dumb question: is it or is it not Unicode?
Hamish Grubijan
Yes, you're right about CompareNoCase, sorry about that, fixed it.
Praetorian
As for Unicode or not: for Visual Studio projects this depends on your project settings, specifically whether the UNICODE macro is defined. ATL also has string conversion classes for going between narrow strings (ANSI by default, or UTF-8 if you pass CP_UTF8 as the code page) and wide UTF-16 strings. Search MSDN for "ATL 7.0 String Conversion Classes and Macros"; they are called CW2A(), CT2W(), etc.
Praetorian
@Hamish Grubijan: Your string will not be stored in *Unicode*. Unicode is simply a specification. Your data needs to use a specific representation of that specification, i.e., UTF-8, UTF-16, UTF-32 or one of many others. The most common is UTF-8. If you're using UTF-8, then the . character is a single byte.
bluesmoon
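To make the last point concrete, here is a small sketch, assuming the file name is held as UTF-8 bytes in a std::string. The helper LastDotUtf8 and the sample name are hypothetical and only for illustration. It relies on the fact that every byte of a multi-byte UTF-8 sequence is 0x80 or above, so a byte equal to '.' (0x2E) can only be a real period.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: byte position of the last '.' in a UTF-8 encoded
// name, or std::string::npos if there is none.
std::size_t LastDotUtf8(const std::string& utf8Name)
{
    // A plain byte search is safe here: no byte inside a multi-byte
    // UTF-8 sequence can have the value 0x2E.
    return utf8Name.rfind('.');
}

// "résumé.xml" spelled out as UTF-8 bytes so the encoding of the source
// file does not matter (each é is the two bytes 0xC3 0xA9).
const std::string kSampleUtf8Name = "r\xC3\xA9sum\xC3\xA9.xml";

// Usage check: "résumé" occupies 8 bytes, so the period is at byte index 8.
void CheckSampleName()
{
    const std::size_t dot = LastDotUtf8(kSampleUtf8Name);
    assert(dot == 8);
    assert(kSampleUtf8Name.substr(dot) == ".xml");
}
```

Note that positions here are byte offsets, not character counts. That is fine for cutting off and replacing an ASCII extension, but it would not be the right unit for counting visible characters.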