Hi,

I know how antivirus software detects viruses. I have read a few articles:

http://stackoverflow.com/questions/1396443/how-do-antivirus-programs-detect-viruses

http://www.antivirusworld.com/articles/antivirus.php

http://www.agusblog.com/wordpress/what-is-a-virus-signature-are-they-still-used-3.htm

During my one-month vacation, I want to learn how to code a simple virus detection program. According to the articles above, there are 2-3 ways to do it:

  1. Virus Dictionary : Searching for virus signatures
  2. Detecting malicious behavior

I want to take the 2nd approach. I want to start off with simple things.

As a side note, I recently came across a program named "ThreatFire" that does this. It does a pretty good job.

  1. The first thing I don't understand is how this program can intervene in the execution of another program and prompt the user about its action. Isn't that some kind of violation?
  2. How does it scan the memory of other programs? A program is confined to its own virtual address space, right?
  3. Is C# .NET the right choice for this kind of thing?
  4. Please post your ideas on how to go about it, and mention some simple things that I could do.
+4  A: 
  1. This happens because the software in question likely has a special driver installed that gives it low-level kernel access, which lets it intercept and deny various potentially malicious behaviors.

  2. Having the rights that many drivers have gives it the ability to scan another process's memory space.

  3. No. C# needs a good chunk of the operating system already loaded. Drivers need to load first.

  4. Learn about driver- and kernel-level programming. I haven't done so myself, so I can't be of more help here.

Jason D
claws
I want to know all the possible alternatives I might have. As you said, you didn't do any of the stuff you mentioned, so I'm waiting for someone with good knowledge of these things to shed some more light.
claws
"Learn about driver level programming" is the starting point. Unless you understand how to create a relatively simple driver your ability to do something fancy like a virus checker will be hampered.
Jason D
Claws, hooking is driver-level programming. Jason is suggesting that instead of jumping straight to some sort of virus detection, you learn the basics, which is usually a Good Idea. @Jason: on #3, only the data gathering needs to be kernel level. Most of the anti-virus program could be in whatever language you'd like (C#, ...).
Matt Luongo
@Matt, fair enough on #3. However, I was always under the impression that a userspace app communicating with a kernel-level driver would be less efficient and induce severe slowdowns (à la McAfee) if you placed your detection heuristics there.
Jason D
+3  A: 

I think system calls are the way to go, and a lot more doable than actually trying to scan multiple processes' memory spaces. While I'm not a low-level Windows guy, it seems like this can be accomplished using Windows API hooks: tie-ins to the low-level API that can modify the system-wide response to a system call. These hooks can be installed as something like a kernel module, and can intercept and potentially modify system calls. I found an article on CodeProject that offers more information.
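
To make the "intercept and potentially modify" idea concrete, here is a minimal in-process analogy in Python. It is only a sketch: it wraps a function inside the current Python process, which is not how Windows API hooking or a kernel driver works, and the names install_hook, deny_system_paths and the /protected/ path are invented for illustration.

```python
import functools
import os


def install_hook(module, name, policy):
    """Replace module.<name> with a wrapper that asks `policy` before each call."""
    original = getattr(module, name)

    @functools.wraps(original)
    def hooked(*args, **kwargs):
        # The policy returns True to allow the call and False to block it.
        if not policy(name, args, kwargs):
            raise PermissionError(f"blocked call to {name}{args!r}")
        return original(*args, **kwargs)

    setattr(module, name, hooked)
    return original  # returned so the caller can uninstall the hook later


call_log = []


def deny_system_paths(name, args, kwargs):
    """Hypothetical policy: log every call, refuse anything under /protected/."""
    call_log.append((name, args))
    path = str(args[0]) if args else ""
    return not path.startswith("/protected/")


original_remove = install_hook(os, "remove", deny_system_paths)
try:
    os.remove("/protected/important.cfg")  # intercepted before reaching the OS
except PermissionError as exc:
    blocked_message = str(exc)
finally:
    os.remove = original_remove  # always restore the real function
```

The shape is the same as what the hooking tools do at a much lower level: see the call first, record it, then decide whether to pass it through to the real implementation.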

In a machine learning course I took, a group decided to try something similar to what you're describing for a semester project. They used a list of recent system calls made by a program to determine whether or not the executing program was malicious, and the results were promising (think 95% recognition on new samples). In their project, they trained using SVMs on windowed call lists, and used that to determine a good window size. After that, you can collect system call lists from different malicious programs, and either train on the entire list, or find what you consider "malicious activity" and flag it. The cool thing about this approach (aside from the fact that it's based on ML) is that the window size is small, and that many trained eager classifiers (SVM, neural nets) execute quickly.
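
I don't have the group's code, so the following is only a guess at what "windowed call lists" might look like in practice: slide a fixed-size window over a trace of call names and turn each window into a count vector. The trace, the window size of 3, and the function names are all made up for the example; the resulting vectors are the kind of input you could hand to an SVM or any other classifier.

```python
from collections import Counter


def sliding_windows(calls, size):
    """Yield every run of `size` consecutive calls from a trace."""
    for start in range(len(calls) - size + 1):
        yield tuple(calls[start:start + size])


def window_features(window, vocabulary):
    """Turn one window into a fixed-length vector of per-call counts."""
    counts = Counter(window)
    return [counts[name] for name in vocabulary]


# Invented trace, for illustration only.
trace = ["NtOpenFile", "NtReadFile", "NtWriteFile", "NtClose", "NtOpenFile"]

# A fixed, sorted vocabulary keeps the vector layout stable across windows.
vocabulary = sorted(set(trace))

vectors = [window_features(w, vocabulary) for w in sliding_windows(trace, 3)]
```

Count vectors throw away the order of calls inside a window. If order matters, n-grams of calls would be one alternative encoding.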

Anyway, it seems like it could be done without the ML if it's not your style. Let me know if you'd like more info about the group; I might be able to dig it up. Good luck!

Matt Luongo
Guess what? I was also planning to do exactly the same thing: using an SVM to classify malicious activity. Could you please give me more info?
claws
I'm having some trouble finding their final paper, and I'm not sure whether they went on to publish. The class was Intro to Statistical Machine Learning, CS 4/7641 at Georgia Tech, taught by Charles Isbell in Spring '09. I'm shooting him an email now to ask if I can get a copy of the paper.
Matt Luongo
This is fairly similar, though: "Using Support Vector Machine To Detect Unknown Computer Viruses", Zhang et al., International Journal of Computational Intelligence Research, Volume 2, No. 1, 2006. Google Docs had a quick view up. Anyway, I'll let you know when I hear back. I realize none of this answers the question at hand, which is Windows API hooking.
Matt Luongo
+1  A: 
  1. Windows provides APIs to do that (generally they involve running at least some of your code in the kernel). If you have sufficient privileges, you can also inject a DLL into another process. See http://en.wikipedia.org/wiki/DLL_injection.

  2. When you have the powers described above, you can do that. You are either in kernel space and have access to everything, or inside the target process.

  3. At least for the low-level in-kernel stuff you'd need something lower-level than C#, like C or C++. I'm not sure, but you might be able to do some of the rest in a C# app.

  4. DLL injection sounds like the simplest starting point. You're still in user space, and don't have to learn how to live in the kernel world (it's a completely different world, really).

Some loose ideas on the topic in general:

  • you can interpose system calls issued by the traced process. It is generally assumed that a process cannot do anything "dangerous" without issuing a system call.
  • you can intercept its network traffic and see where it connects to, what it sends and receives, which files it touches, and which system calls fail
  • you can scan its memory and simulate its execution in a sandbox (really hard)
  • with system call interposition, you can simulate some responses to the system calls, while really just sandboxing the process
  • you can scan the process memory and extract some general characteristics from it (connects to the network, modifies registry, hooks into Windows, enumerates processes, and so on) and see if it looks malicious
  • just put the entire thing in a sandbox and see what happens (a nice sandbox has been made for Google Chrome, and it's open source!)
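
As a rough illustration of the "extract some general characteristics and see if it looks malicious" idea, here is a small rule-based sketch. It assumes you already have a list of API names observed for a process (from hooking, tracing, or whatever you end up using). The API names are real Windows or Winsock functions, but the mapping, the weights, and the threshold are invented and would need tuning against real samples.

```python
# Hypothetical mapping from observed API names to coarse behaviors.
BEHAVIOR_BY_API = {
    "connect": "connects to the network",
    "RegSetValueExW": "modifies registry",
    "SetWindowsHookExW": "hooks into Windows",
    "CreateToolhelp32Snapshot": "enumerates processes",
}

# Invented weights: how much each behavior adds to the suspicion score.
WEIGHTS = {
    "connects to the network": 1,
    "modifies registry": 1,
    "hooks into Windows": 2,
    "enumerates processes": 1,
}


def summarize(observed_calls):
    """Return the sorted set of behaviors implied by the observed calls."""
    found = {BEHAVIOR_BY_API[c] for c in observed_calls if c in BEHAVIOR_BY_API}
    return sorted(found)


def looks_suspicious(observed_calls, threshold=3):
    """Flag a process when its combined behavior weight reaches the threshold."""
    score = sum(WEIGHTS[b] for b in summarize(observed_calls))
    return score >= threshold


sample = ["connect", "RegSetValueExW", "ReadFile", "SetWindowsHookExW"]
report = summarize(sample)
flagged = looks_suspicious(sample)
```

Plenty of legitimate programs do every one of these things, so a fixed rule table like this will produce false positives. It is a starting point for experiments, not a detector.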
phjr
Thank you. This is very helpful.
claws