I want to know which techniques antivirus programs use to scan disks or files while keeping memory consumption low. They don't seem to interfere with user activity either.

I am looking for an approach by which we can achieve disk scanning with low memory consumption.

+9  A: 

They don't. Every scanner I know uses a lot of memory and has an impact on performance.

GvS
+1. I stopped using antivirus software 6 years ago because I found that it literally cut system performance in half on every machine I knew.
Vilx-
I know; the first thing I do is put my database files on the ignore list. That helps a bit on a dev machine. But as for removing it entirely, I think you could run into legal problems if you accidentally distribute a virus (with your app or email) to a customer, and they find out you are not running any antivirus software.
GvS
+1  A: 

I think you are overestimating the leanness of these scanning tools. I've seen them routinely take huge chunks of memory and occasionally spike the CPU for a while. They also hijack your startup sequence to make sure they load first, which delays everything else.

Andy_Vulhop
Then they send in their fighters, who try to target a three-foot exhaust port right below the main port. :)
rtperson
@rtperson: lol :)
Nelson
+2  A: 

NOD32 has a pretty small footprint, but still 10-20MB in memory.

Keep in mind what AV has to do for the most part: look at the executable part of each file for malicious bytes. A traditional virus is typically less than 1000 bytes, and the identifiable pattern may be only 50 bytes. So for AV to protect you against 100K virus patterns, it only needs a pattern database of 50 bytes * 100K = 5MB.

Rob Elliott
+1  A: 

You should explore memory mapped files. They allow one to process huge files without loading the entire file into memory at one time.
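
As a rough illustration of the idea, here is a minimal Python sketch that searches a file for a byte pattern through a memory map. The function name `find_signature` and the example pattern are made up for this sketch; only the standard `mmap` and `os` modules are used. The OS pages the file in on demand, so the process does not need a buffer the size of the file.

```python
import mmap
import os


def find_signature(path: str, signature: bytes) -> int:
    """Return the offset of the first occurrence of signature, or -1."""
    with open(path, "rb") as f:
        # An empty file cannot be memory-mapped, so report "not found".
        if os.fstat(f.fileno()).st_size == 0:
            return -1
        # Length 0 maps the whole file; pages are loaded lazily by the OS.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view.find(signature)
```

Calling `find_signature("some_file.bin", b"EVIL")` returns the byte offset of the match or -1. A real scanner would match many patterns in one pass rather than calling this once per pattern.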

Michael McCloskey
But then for every original disk read (or at least a lot of them), you need to read some other part of the disk to get all the signatures.
GvS
I was not answering in the context of creating a virus scanner, but in the context of disk scanning and file processing in general.
Michael McCloskey
+3  A: 

I agree with most people here that antivirus software has never had low memory or CPU consumption. However, here are a few ideas off the top of my head:

  • Scan only the files the user opens, only when he opens them.
  • Only scan risky files - like executables or scripts, not all files.
  • The scanning is usually done by hashing the file and matching the hash against known virus hashes. To minimize memory usage you could just keep the known hashes on disk and search them when needed, but that would be very slow. The fastest way would be to keep them all in RAM and forbid the OS to swap them out, but that would use a lot of memory. A tradeoff can be achieved with several levels of hash caches, like this:
    • 1st level cache contains 24-bit hashes as a bitmask. With one bit per possible value this occupies 2^24 bits, about 2MB of RAM (16MB if you spend a whole byte per value), and can be kept completely in RAM (forbidding the OS to swap it out). Checking this can be done very quickly.
    • 2nd level cache contains full 128-bit or larger hashes and is kept on disk. The second level cache is tested only if the first level cache gets a hit. Because the hash space of the 1st level cache is small, it is likely to produce a lot of false positives, so the second level cache has to confirm each hit.
  • Cache the results of the last, say, 1000 files scanned. This way you don't have to do all the hashing and checking over and over again for files that are often used.
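
A minimal Python sketch of the two-level lookup described above, under some assumptions of mine: the first level is the top 24 bits of the full hash, packed one bit per value (2^24 bits, i.e. 2 MiB); the second level is a file of sorted fixed-size digests that is binary-searched on disk. The names `TwoLevelHashIndex`, `write_sorted_digests` and `file_digest` are hypothetical, and BLAKE2b with a 16-byte digest is just one way to get a 128-bit hash; real products will differ.

```python
import hashlib

PREFIX_BITS = 24   # size of the first-level hash space
DIGEST_SIZE = 16   # 128-bit digests in the second level


def file_digest(path, chunk_size=64 * 1024):
    """Hash a file in fixed-size chunks so memory use stays constant."""
    h = hashlib.blake2b(digest_size=DIGEST_SIZE)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()


def write_sorted_digests(db_path, digests):
    """Write known-bad digests sorted, so the file can be binary-searched."""
    with open(db_path, "wb") as f:
        for d in sorted(set(digests)):
            f.write(d)


class TwoLevelHashIndex:
    def __init__(self, db_path):
        self.db_path = db_path
        self.bitmap = bytearray((1 << PREFIX_BITS) // 8)  # 2 MiB in RAM
        self.count = 0
        # Stream the on-disk digests once to populate the bitmap.
        with open(db_path, "rb") as f:
            while True:
                d = f.read(DIGEST_SIZE)
                if len(d) < DIGEST_SIZE:
                    break
                p = self._prefix(d)
                self.bitmap[p >> 3] |= 1 << (p & 7)
                self.count += 1

    @staticmethod
    def _prefix(digest):
        # The first 3 bytes of the digest form the 24-bit first-level hash.
        return int.from_bytes(digest[:3], "big")

    def contains(self, digest):
        p = self._prefix(digest)
        # Level 1: a clear bit is a definite miss, with no disk access.
        if not self.bitmap[p >> 3] & (1 << (p & 7)):
            return False
        # Level 2: a set bit may be a false positive, so confirm on disk.
        with open(self.db_path, "rb") as f:
            lo, hi = 0, self.count
            while lo < hi:
                mid = (lo + hi) // 2
                f.seek(mid * DIGEST_SIZE)
                cur = f.read(DIGEST_SIZE)
                if cur == digest:
                    return True
                if cur < digest:
                    lo = mid + 1
                else:
                    hi = mid
        return False
```

Usage would look like `TwoLevelHashIndex("known.db").contains(file_digest(path))`. Most clean files are rejected by the in-RAM bitmap alone, and only bitmap hits pay for a few disk seeks.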
Vilx-
Ah, finally an answer he can do something with :-) I think it comes down to this: if you can prove a file has not changed since the last scan, the file was virus free in that scan, and your virus definitions have not changed since then, you do not need to scan the file again.
GvS
Yup. But that's just one corner you can cut to achieve better performance. Oh, btw - if you didn't notice, my suggestions all trade memory for performance. If you want to use less memory, you shouldn't do any of them. Instead hash the file every time and seek the hash on the disk every time. Keep nothing in RAM except what you need. Of course - the performance might be a bit low... ;)
Vilx-