I need to use C# to search a directory (C:\Logs) for log files whose names start with ACCESS. For each matching file, I need to search its contents and build a collection of the strings of the form Identity="...". For example, given Identity="SWN\smithj", I need everything from Identity through the closing double quote. Once I reach the end of a file, I need to move on to the next file whose name begins with ACCESS. Can someone show me how to do this in C#?

Many thanks

+2  A: 

It looks like you've got two tasks here:
1) Find the files with names like ACCESS*
2) Search those files for lines like "Identity=*"

To do the first, use a DirectoryInfo object and the GetFiles() method with a search pattern of "ACCESS*".

DirectoryInfo myDir = new DirectoryInfo(dirPath);
var files = myDir.GetFiles("ACCESS*"); // GetFiles is an instance method, so call it on myDir

Then you'll loop through those files looking for the data you need.

List<Tuple<string, string>> IdentityLines = new List<Tuple<string, string>>(); // Item1 = file name, Item2 = line
foreach (FileInfo file in files)
{
    using (StreamReader sr = new StreamReader(file.FullName)) // FullName is the complete path to the file
    {
        string line;
        while ((line = sr.ReadLine()) != null) // ReadLine returns null at the end of the file
        {
            if (line.StartsWith("Identity="))
                IdentityLines.Add(Tuple.Create(file.Name, line));
        }
    }
}

This hasn't been compiled, so double check it, but it should be pretty close to what you need.

EDIT: Added full solution based on comments from OP. Has been compiled and run.

DirectoryInfo myDir = new DirectoryInfo(@"C:\Testing");
var Files = myDir.GetFiles("ACCESS*");

List<KeyValuePair<string, string>> IdentityLines = new List<KeyValuePair<string, string>>();

foreach(FileInfo file in Files)
{
    string line = "";
    using(StreamReader sr = new StreamReader(file.FullName))
    {
        while(!String.IsNullOrEmpty(line = sr.ReadLine()))
        {
           if(line.ToUpper().StartsWith("IDENTITY="))
              IdentityLines.Add(new KeyValuePair<string, string>(file.Name, line));
        }
    }
}

foreach(KeyValuePair<string, string> line in IdentityLines) 
{
    Console.WriteLine("FileName {0}, Line {1}", line.Key, line.Value);
}
AllenG
Hi Allen, is the line just before your loop correct? What is a Tuple, and what is IdentityLines?
Josh
A Tuple<T1, T2> is a paired type. Basically, it's just designed to create a correlation between two pieces of data. It's new in .NET 4, so if you're using 3.5 or earlier, you can replace the list with a List<KeyValuePair<string, string>>. IdentityLines is just the variable name I gave to the list. Once you're done gathering your lines, you can then output them along with the file in which they were found.
AllenG
@Allen - I used `List<KeyValuePair<string,string>> IdentityLines;`. On the last line of the while loop I have `IdentityLines.Add(file.Name, line);` and I'm getting a "No overload method for ADD takes two arguments" error. Thoughts?
Josh
It's because I missed a step. Try `IdentityLines.Add(new KeyValuePair<string, string>(file.Name, line))`. You may have to play with it (thank goodness for IntelliSense) to get it exactly right.
AllenG
The "no overload method for 'ADD' takes two arguments" error still occurs on the last line: `List<KeyValuePair<string, string>> IdentityLines = new List<KeyValuePair<string, string>>(); foreach(FileInfo file in files) { using(StreamReader sr = new StreamReader(file.FullName)) { string line; while(!string.IsNullOrEmpty(line = sr.Read().ToString())) { if (line.StartsWith("Identity=")) IdentityLines.Add(file.Name, line); } } }`
Josh
Oh OK, I didn't see your last post until now. Let me try that.
Josh
With a StreamReader you can also read the entire file into memory with the StreamReader.ReadToEnd() method (works well for smallish files).
Edward Leno
That fixed the error. So now all strings that start with Identity=" " will be stored in IdentityLines? I assume I can just print those out to a text file now?
Josh
@Edward: Indeed you can, but this way he can grab just what he needs and ignore the rest. @Josh: That's the theory. Like I said, I haven't actually run that code, so there may be additional errors, but you should be able to print that list to screen, or drop it to a file, or whatever you need to do with it.
AllenG
@Allen: Hi Allen, do you see anything wrong with the code below? For some reason the message box never pops up, which means nothing is getting added to IdentityLines: `string line; while(!string.IsNullOrEmpty(line = sr.Read().ToString())) { if (line.StartsWith("Identity=")) { IdentityLines.Add(new KeyValuePair<string, string>(file.Name, line)); MessageBox.Show("Your inside"); }`
Josh
@Josh: It may depend on your files. I'm updating my code with something I've now actually compiled and run.
AllenG
Hi Allen, your code works, but I think there is a problem with using .StartsWith for the string search. Each line in the log file is a rather long string, and somewhere within it is something like: yadayada Identity="swn\smithj" yadayada. What I need to collect is just the Identity="swn\smithj" part. .StartsWith was picking up any line that began with the search text, but when I changed it to .Contains it picked up and returned the entire line, not just Identity="swn\rodgert".
Josh
You may need to work on your search string, then. It's also possible that Regex will help with that (I can do a little regex, but there are others on SO who are far better). That said, @Dan Tao's suggestion looks a lot cleaner than what I'm suggesting, assuming you're on VS2K8.
AllenG
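The Regex idea mentioned above could look roughly like the following. This is only a sketch: the class name `IdentityExtractor`, the method name `Extract`, and the pattern itself are assumptions made for illustration, and the pattern assumes the value never contains an embedded double quote.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the answers above.
public static class IdentityExtractor
{
    // Matches Identity="..." up to the next double quote.
    // IgnoreCase mirrors the ToUpper() comparison used in the compiled answer.
    private static readonly Regex IdentityPattern =
        new Regex("Identity=\"[^\"]*\"", RegexOptions.IgnoreCase);

    // Returns every Identity="..." fragment found anywhere in the line.
    public static List<string> Extract(string line)
    {
        var results = new List<string>();
        foreach (Match match in IdentityPattern.Matches(line))
        {
            results.Add(match.Value);
        }
        return results;
    }
}
```

Inside the read loop, each fragment returned by `IdentityExtractor.Extract(line)` could be added to the list instead of the whole line.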
Allen - Your solution ended up working. Do you know how I can get rid of duplicate values before I start printing the result out?
Josh
@Josh: Define a duplicate value. Do you mean a second entry of the same "IDENTITY=XXXXX" with a given key, or just any duplicate entries for "IDENTITY=XXXXXX" regardless of key?
AllenG
Yeah, sorry about that, Allen. Within my large string, Identity can be equal to "swn\smithj" or maybe "swn\rodgersb", etc. So when I'm printing out each of these results I sometimes have duplicates like Identity="swn\smithj" Identity="smithj". Does that help? I'm a little confused about what you meant.
Josh
@Josh: At this point, I'd say make this a new question. Reference this question, but then give a sample of the output you're getting that's in error and what you'd prefer to be getting.
AllenG
Yeah I wondered about asking that additional question before asking it. I'll try to make a new question and reference this question. Thanks again
Josh
+2  A: 

Here's a pretty terse way to accomplish what you're after.

public static IEnumerable<string> GetSpecificLines(this DirectoryInfo dir, string fileSearchPattern, Func<string, bool> linePredicate)
{
    FileInfo[] files = dir.GetFiles(fileSearchPattern);

    return files
        .SelectMany(f => File.ReadAllLines(f.FullName))
        .Where(linePredicate);
}

Usage:

var lines = new DirectoryInfo(@"C:\Logs") // verbatim string, so the backslash is not treated as an escape
    .GetSpecificLines("ACCESS*", line => line.StartsWith("Identity="));
Dan Tao
Very nice use of LINQ. I wonder if chaining the Where() filter off ReadAllLines is more efficient? e.g. `.SelectMany(f => File.ReadAllLines(f.FullName).Where(linePredicate));` This way you wouldn't potentially store a whole lot of data in a buffer prior to filtering.
Jacob
@Jacob: I see what you mean; but unless I'm mistaken, it really shouldn't make any difference. Since `SelectMany` and `Where` are lazily evaluated, the steps will be the same: the lines within each `string[]` array returned by `File.ReadAllLines` will be enumerated over individually, and only those matching `linePredicate` will be returned. If you step through the code in a debugger you'll see what I mean.
Dan Tao
@Jacob: (In other words what I'm saying is that you won't be storing "a whole lot of data in a buffer" either way -- except for the `string[]` arrays returned by `File.ReadAllLines`, again, either way -- because the Linq extension methods provide lazy evaluation.)
Dan Tao
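A small sketch of the lazy evaluation described above. It uses made-up in-memory arrays in place of the `string[]` results of `File.ReadAllLines`, and the names `LazyEvaluationDemo` and `CountPredicateCalls` are hypothetical. It counts how often the predicate runs before and after the first match is pulled from the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the answer above.
public static class LazyEvaluationDemo
{
    // Returns { predicate calls before enumeration, predicate calls after First() }.
    public static int[] CountPredicateCalls()
    {
        // Stand-ins for the arrays File.ReadAllLines would return (made-up data).
        string[][] fileContents =
        {
            new[] { "Identity=\"swn\\smithj\"", "something else" },
            new[] { "noise", "Identity=\"swn\\rodgersb\"" }
        };

        int calls = 0;
        IEnumerable<string> query = fileContents
            .SelectMany(lines => lines)
            .Where(line => { calls++; return line.StartsWith("Identity="); });

        int before = calls; // the query is built but nothing has been enumerated yet
        query.First();      // pulls lines one at a time until the first match
        int after = calls;

        return new[] { before, after };
    }
}
```

With this data the method returns 0 and 1: building the query runs nothing, and `First()` stops as soon as the first line matches.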
Just ran a test. You're right.
Jacob
Hi Dan, under Usage you have `var lines = new DirectoryInfo("C:\Logs")`. When I type a period right after that last parenthesis, am I supposed to see GetSpecificLines as an option? I don't. I see many other options but not GetSpecificLines. Am I missing something?
Josh
@Josh: The example I provided happens to be an extension method. To get that to work you would define it from inside a static class. Then once you've compiled the project the method should appear in Intellisense. If you are using an older version of .NET, you will just have to make it a regular static method.
Dan Tao
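For reference, a minimal sketch of the arrangement described above, with the extension method placed inside a static class. The class name `DirectoryInfoExtensions` is an assumption; any non-nested, non-generic static class should work.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Extension methods must be declared in a static class.
// The class name here is hypothetical.
public static class DirectoryInfoExtensions
{
    // The "this" modifier on the first parameter is what makes the method
    // callable as dir.GetSpecificLines(...).
    public static IEnumerable<string> GetSpecificLines(
        this DirectoryInfo dir,
        string fileSearchPattern,
        Func<string, bool> linePredicate)
    {
        return dir.GetFiles(fileSearchPattern)
            .SelectMany(f => File.ReadAllLines(f.FullName))
            .Where(linePredicate);
    }
}
```

The calling file also needs a `using` directive for whatever namespace the static class lives in. On a compiler without extension method support, dropping the `this` modifier and calling `DirectoryInfoExtensions.GetSpecificLines(dir, "ACCESS*", predicate)` should behave the same way.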