OK, I have .txt files that I am parsing and saving into a SQL database. The names are formatted like

R306025COMP_272A4075_20090929_080159.txt

However, a select few (out of thousands of files) have names that are formatted differently (particularly files that were generated as tests), for example:

R306025COMP_SU2_TestBottom_20090915_101441.txt

This causes a problem because I am using Split('_')[1], Split('_')[2], etc. to extract the R number, the 272A4075 portion, and the 20090929 (date) portion. When the application comes across the oddly named files, it fails because it tries to parse 'TestBottom' as a date and inserts 'SU2' instead of the 272 number.

Basically, I want the app to skip any file whose name is not formatted like my first example. Any advice?

+2  A: 

Can you just do the following based on the split:

string[] parsedLine = yourData.Split('_');
string theR = parsedLine[0];
string theCode = parsedLine[1];
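// A normal name splits into 4 parts; the test names have an extra part, which shifts the date to index 3.
string theDatePart = (parsedLine.Length > 4) ? parsedLine[3] : parsedLine[2];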

If you want it to just skip the bad lines, do:

string[] parsedLine = yourData.Split('_');
if (parsedLine.Length > 4) continue;  // assuming you're looping

I would need to see a little code to suggest a better solution, as I am not exactly sure how you are getting the line data.

Kelsey
@jakesankey You should do the split only once, store the result, and then check that the data is valid BEFORE trying to write to the DB. Doing it all inline with no validation seems to be your problem.
Kelsey
The problem with this would be if the name changed to something like R306025COMP_TestBottom_20090915_101441_SU2.txt. Then what would happen?
jakesankey
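One way to combine the two comments above (split once, validate before writing to the DB) is sketched below. It is only a sketch: FileNameParser and TryParseName are hypothetical names, and it assumes a well-formed name always has exactly four '_'-separated parts with a yyyyMMdd date in the third position. Under those assumptions, both oddly named examples from this thread are rejected because they split into five parts, and a four-part name with a non-date third part is rejected by the date check.

```csharp
using System;
using System.Globalization;

static class FileNameParser
{
    // Hypothetical helper: returns false for any name that does not look like
    // R306025COMP_272A4075_20090929_080159.txt, so the caller can skip it.
    public static bool TryParseName(string fileName, out string rNumber,
                                    out string code, out DateTime date)
    {
        rNumber = string.Empty;
        code = string.Empty;
        date = default(DateTime);

        // Split once and keep the result.
        string[] parts = fileName.Split('_');

        // Assumption: a valid name has exactly four parts.
        if (parts.Length != 4) return false;

        // Assumption: the third part is a date in yyyyMMdd form.
        if (!DateTime.TryParseExact(parts[2], "yyyyMMdd",
                CultureInfo.InvariantCulture, DateTimeStyles.None, out date))
        {
            return false;
        }

        rNumber = parts[0];
        code = parts[1];
        return true;
    }
}
```

Inside the loop this would be used as: if (!FileNameParser.TryParseName(name, out r, out code, out date)) continue; with the DB insert only reached for names that passed both checks.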
A: 
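// Requires: using System.Linq;
foreach (var fileName in fileNames) {
    // A correctly formatted name contains exactly three underscores; skip anything else.
    if (fileName.Count(c => c == '_') != 3) continue;
    // etc...
}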
kevingessner
+1  A: 

Use a regex match on the file name. A regex can match anywhere in the file name, so you don't have to be concerned about exactly where in the string each piece occurs, and the matched values are extracted for you. If the required matches are not found, skip the file: no exception is thrown, you just get nothing in your Matches object.

I'd write up a sample, but I don't have VS handy at the moment. The Regex classes live in the System.Text.RegularExpressions namespace.

slugster
I would love to see an example of this!
jakesankey
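A minimal sketch of what such a regex check might look like follows. The pattern is an assumption inferred from the single well-formed example in the question (two '_'-free parts, an 8-digit date, a 6-digit time, then .txt); FileNamePattern and TryMatch are hypothetical names. Unlike the "match anywhere" idea above, this sketch anchors the pattern to the whole name with ^ and $, so a name with extra or reordered parts does not match at all.

```csharp
using System;
using System.Text.RegularExpressions;

static class FileNamePattern
{
    // Assumed layout: <R part>_<code>_<yyyyMMdd>_<HHmmss>.txt
    static readonly Regex Pattern = new Regex(
        @"^(?<r>[^_]+)_(?<code>[^_]+)_(?<date>\d{8})_(?<time>\d{6})\.txt$");

    // Returns true and the Match when the name fits the assumed layout;
    // returns false (no exception) for anything else.
    public static bool TryMatch(string fileName, out Match match)
    {
        match = Pattern.Match(fileName);
        return match.Success;
    }
}
```

On success, the pieces are available by name, for example match.Groups["r"].Value, match.Groups["code"].Value and match.Groups["date"].Value. The regex only checks that the date part is eight digits, so it would still be worth passing that value through DateTime.TryParseExact before writing to the DB.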