Let's start by breaking the problem down. Your requirement is to compute several different kinds of hashes on the same file. Assume for the moment that you don't need to actually instantiate the types. Start with a function that takes the algorithms already instantiated:
    public IEnumerable<string> GetHashStrings(string fileName,
        IEnumerable<HashAlgorithm> algorithms)
    {
        byte[] fileBytes = File.ReadAllBytes(fileName);
        return algorithms
            .Select(a => a.ComputeHash(fileBytes))
            .Select(b => HexStr(b));
    }
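HexStr isn't defined anywhere in this answer. Here is a minimal sketch of what such a helper might look like, assuming all it needs to do is render the hash bytes as lowercase hex without separators (the class name HashFormatting is made up for illustration):

```csharp
using System;

public static class HashFormatting
{
    // Hypothetical stand-in for the HexStr helper used above:
    // renders a byte array as lowercase hex with no separators.
    public static string HexStr(byte[] bytes)
    {
        if (bytes == null)
            throw new ArgumentNullException(nameof(bytes));

        // BitConverter.ToString produces "DE-AD-BE-EF"; strip the dashes.
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }
}
```

If your own HexStr formats things differently (uppercase, with separators), the rest of the code doesn't care; it only needs a byte[] to string conversion.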
That was easy. If the files might be large and you need to stream them (keeping in mind that this re-reads the file once per algorithm, so it is much more expensive in terms of I/O, just cheaper for memory), you can do that too; it's just a little more verbose:
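    public IEnumerable<string> GetStreamedHashStrings(string fileName,
        IEnumerable<HashAlgorithm> algorithms)
    {
        using (Stream fileStream = File.OpenRead(fileName))
        {
            return algorithms
                .Select(a => {
                    // Rewind so every algorithm hashes the whole file.
                    fileStream.Position = 0;
                    return a.ComputeHash(fileStream);
                })
                .Select(b => HexStr(b))
                // Force evaluation while the stream is still open; without
                // this, the deferred query would run after the using block
                // has already disposed the stream.
                .ToList();
        }
    }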
It's a little gnarly, and in the second case it's highly questionable whether or not the Linq-ified version is any better than an ordinary foreach loop, but hey, we're having fun, right?
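For comparison, here is roughly what the plain foreach version of the streamed variant might look like. This is only a sketch: the class name StreamedHashing is invented, and the hex conversion is inlined with BitConverter as a stand-in for HexStr so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public static class StreamedHashing
{
    // Hashes one file with each algorithm, re-reading the stream each time.
    public static List<string> GetStreamedHashStrings(string fileName,
        IEnumerable<HashAlgorithm> algorithms)
    {
        var results = new List<string>();
        using (Stream fileStream = File.OpenRead(fileName))
        {
            foreach (HashAlgorithm algorithm in algorithms)
            {
                // Rewind so each algorithm sees the whole file.
                fileStream.Position = 0;
                byte[] hash = algorithm.ComputeHash(fileStream);

                // Stand-in for HexStr: lowercase hex, no separators.
                results.Add(BitConverter.ToString(hash)
                    .Replace("-", "").ToLowerInvariant());
            }
        }
        return results;
    }
}
```

Because the loop runs eagerly, all the hashing is finished before the stream is closed, so there is nothing deferred to worry about.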
Now that we've disentangled the hash-generation code, instantiating them first isn't really that much more difficult. Again we'll start with code that's clean - code that uses delegates instead of types:
    public IEnumerable<string> GetHashStrings(string fileName,
        params Func<HashAlgorithm>[] algorithmSelectors)
    {
        if (algorithmSelectors == null)
            return Enumerable.Empty<string>();
        var algorithms = algorithmSelectors.Select(s => s());
        return GetHashStrings(fileName, algorithms);
    }
Now this is much nicer, and the benefit is that it allows instantiation of the algorithms within the method, but doesn't require it. We can invoke it like so:
    var hashes = GetHashStrings(fileName,
        () => new MD5CryptoServiceProvider(),
        () => new SHA1CryptoServiceProvider());
If we really, really, desperately need to start from the actual Type instances, which I'd try not to do because it breaks compile-time type checking, then we can do that as the last step:
    public IEnumerable<string> GetHashStrings(string fileName,
        params Type[] algorithmTypes)
    {
        if (algorithmTypes == null)
            return Enumerable.Empty<string>();
        var algorithmSelectors = algorithmTypes
            .Where(t => t.IsSubclassOf(typeof(HashAlgorithm)))
            .Select(t => (Func<HashAlgorithm>)(() =>
                (HashAlgorithm)Activator.CreateInstance(t)))
            .ToArray();
        return GetHashStrings(fileName, algorithmSelectors);
    }
And that's it. Now we can run this (bad) code:
    var hashes = GetHashStrings(fileName,
        typeof(MD5CryptoServiceProvider),
        typeof(SHA1CryptoServiceProvider));
At the end of the day, this seems like more code but it's only because we've composed the solution effectively in a way that's easy to test and maintain. If we wanted to do this all in a single Linq expression, we could:
    public IEnumerable<string> GetHashStrings(string fileName,
        params Type[] algorithmTypes)
    {
        if (algorithmTypes == null)
            return Enumerable.Empty<string>();
        byte[] fileBytes = File.ReadAllBytes(fileName);
        return algorithmTypes
            .Where(t => t.IsSubclassOf(typeof(HashAlgorithm)))
            .Select(t => (HashAlgorithm)Activator.CreateInstance(t))
            .Select(a => a.ComputeHash(fileBytes))
            .Select(b => HexStr(b));
    }
That's all there really is to it. I've skipped the delegated "selector" step in this final version because if you're writing this all as one function you don't need the intermediate step; the reason for having it as a separate function earlier is to give as much flexibility as possible while still maintaining compile-time type safety. Here we've sort of thrown it away to get the benefit of terser code.
Edit: I will add one thing, which is that although this code looks prettier, it actually leaks the unmanaged resources used by the HashAlgorithm descendants. You really need to do something like this instead:
    public IEnumerable<string> GetHashStrings(string fileName,
        params Type[] algorithmTypes)
    {
        if (algorithmTypes == null)
            return Enumerable.Empty<string>();
        byte[] fileBytes = File.ReadAllBytes(fileName);
        return algorithmTypes
            .Where(t => t.IsSubclassOf(typeof(HashAlgorithm)))
            .Select(t => (HashAlgorithm)Activator.CreateInstance(t))
            .Select(a => {
                // using disposes the algorithm even if ComputeHash throws.
                using (a)
                {
                    return a.ComputeHash(fileBytes);
                }
            })
            .Select(b => HexStr(b));
    }
And again we're kind of losing clarity here. It might be better to just construct the instances first, then iterate through them with foreach and yield return the hash strings. But you asked for a Linq solution, so there you are. ;)
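For what it's worth, a foreach/yield return version might look something like the sketch below. It is a variation on that idea, not a definitive implementation: it takes the Func<HashAlgorithm> factories from earlier and creates each instance inside the loop, so each one is disposed even if the caller stops enumerating early. The class name LoopedHashing is invented, and the hex conversion is again an inlined stand-in for HexStr.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public static class LoopedHashing
{
    // Lazily yields one hex hash string per factory; each algorithm is
    // created, used, and disposed within a single loop iteration.
    public static IEnumerable<string> GetHashStrings(string fileName,
        params Func<HashAlgorithm>[] algorithmFactories)
    {
        if (algorithmFactories == null)
            yield break;

        // Note: as an iterator, nothing here runs (including this read)
        // until the caller starts enumerating.
        byte[] fileBytes = File.ReadAllBytes(fileName);

        foreach (Func<HashAlgorithm> factory in algorithmFactories)
        {
            using (HashAlgorithm algorithm = factory())
            {
                byte[] hash = algorithm.ComputeHash(fileBytes);

                // Stand-in for HexStr: lowercase hex, no separators.
                yield return BitConverter.ToString(hash)
                    .Replace("-", "").ToLowerInvariant();
            }
        }
    }
}
```

It would be called the same way as the delegate version, e.g. LoopedHashing.GetHashStrings(fileName, () => MD5.Create(), () => SHA1.Create()).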