A very similar question has also been asked here on SO, in case you are interested, but as we will see, the accepted answer to that question does not always hold (and it never holds for my application's usage pattern).

The performance-determining code consists of a FileStream constructor call (to open a file) and a SHA1 hash computation (the .NET Framework implementation). The code is pretty much a C# version of what was asked in the question I've linked to above.
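The original code is not shown, so the following is only a minimal sketch of what the two measured steps might look like, assuming a plain read-only FileStream and SHA1.Create(). The class name HashTiming and the helper HashFile are hypothetical, and the Stopwatch timings are a rough stand-in for what a profiler would report.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

public static class HashTiming
{
    // Hypothetical helper (not from the original post): opens the file,
    // hashes its contents, and reports how long each of the two steps took.
    public static byte[] HashFile(string path, out TimeSpan openTime, out TimeSpan hashTime)
    {
        var sw = Stopwatch.StartNew();

        // Step 1: the FileStream constructor opens the file; it does not read the contents.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            openTime = sw.Elapsed;
            sw.Restart();

            // Step 2: ComputeHash reads the stream to the end and hashes it.
            using (var sha1 = SHA1.Create())
            {
                byte[] hash = sha1.ComputeHash(stream);
                hashTime = sw.Elapsed;
                return hash;
            }
        }
    }
}
```

Calling HashFile once per file and summing openTime and hashTime separately should give a split comparable to the percentages listed below, though the exact numbers will depend on the machine and the state of the file cache.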

Case 1: The application is started, either for the first time or for the Nth time but with a different set of target files. It is then told to compute hash values for files that have never been accessed before.

  • ~50ms
  • 80% FileStream constructor
  • 18% hash computation

Case 2: The application is fully terminated, started again, and asked to compute hashes for the same files:

  • ~8ms
  • 90% hash computation
  • 8% FileStream constructor

Problem
My application is always in Case 1. It will never be asked to recompute a hash for a file that it has already visited.

So my rate-determining step is the FileStream constructor! Is there anything I can do to speed up this use case?

Thank you.

P.S. Stats were gathered using JetBrains profiler.

+1  A: 

The file system and/or disk controller will cache recently accessed files and sectors.

The rate-determining step is reading the file, not constructing a FileStream object, and it's completely normal for it to be significantly faster on the second run, when the data is already in the cache.

Joe
I don't believe this is the case. The FileStream constructor does not read the entire file; it is the hash function that reads it. But it is the constructor that takes 80% of the time.
Alex K
A: 

You should try the native FILE_FLAG_SEQUENTIAL_SCAN flag. You will have to P/Invoke CreateFile to get a handle and then pass that handle to FileStream.

Shay Erlichmen
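A minimal sketch of this suggestion, assuming Windows and read-only access, might look like the code below. The class and method names (SequentialOpen, OpenViaPInvoke, OpenManaged) are hypothetical. The second method uses the managed FileOptions.SequentialScan option, which carries the same flag value as FILE_FLAG_SEQUENTIAL_SCAN and avoids the P/Invoke. Whether either variant shortens the cold-cache open time from Case 1 is not established in this thread and would need to be measured.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

public static class SequentialOpen
{
    // Win32 constants from the CreateFile documentation.
    private const uint GENERIC_READ = 0x80000000;
    private const uint FILE_SHARE_READ = 0x00000001;
    private const uint OPEN_EXISTING = 3;
    private const uint FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000;

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    private static extern SafeFileHandle CreateFile(
        string lpFileName,
        uint dwDesiredAccess,
        uint dwShareMode,
        IntPtr lpSecurityAttributes,
        uint dwCreationDisposition,
        uint dwFlagsAndAttributes,
        IntPtr hTemplateFile);

    // Windows only: opens the file natively with the sequential-scan hint
    // and wraps the resulting handle in a FileStream.
    public static FileStream OpenViaPInvoke(string path)
    {
        SafeFileHandle handle = CreateFile(
            path, GENERIC_READ, FILE_SHARE_READ, IntPtr.Zero,
            OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, IntPtr.Zero);

        if (handle.IsInvalid)
            throw new Win32Exception(Marshal.GetLastWin32Error());

        return new FileStream(handle, FileAccess.Read);
    }

    // Managed alternative: passes the same hint through the FileStream constructor.
    public static FileStream OpenManaged(string path)
    {
        return new FileStream(path, FileMode.Open, FileAccess.Read,
                              FileShare.Read, 4096, FileOptions.SequentialScan);
    }
}
```

Either stream can be handed to the hash computation in place of the one created by the plain FileStream constructor.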