We are trying to design a fraud detection system for the stock market. I know the specifications of the frauds (they are like templates), so I want to know whether I can define a template and find all records that match it.


Note: I can't use traditional queries because the templates are complex. For example, one of my frauds is circular trading, which looks like this: A bought from B, B bought from C, and C bought from A (a cycle). The cycle can also include 4 or 5 persons.

Is there a good approach for this situation?

A: 

Theoretically you could develop a "Small Language" first, something with a simple syntax (that makes expressing the domain - in your case fraud patterns - easy) and from it generate one or more SQL queries.
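As a rough illustration of the idea, here is a minimal sketch in Python that compiles a one-line cycle template into a SQL self-join. Everything in it is an assumption made for the example: the arrow syntax ("A -> B" read as "A bought from B"), the function name cycle_template_to_sql, and a table called transactions with buyer and seller columns. A real pattern language would need far more than cycles.

```python
import sqlite3


def cycle_template_to_sql(template: str, table: str = "transactions") -> str:
    """Compile a template such as 'A -> B -> C -> A' into one SQL query.

    Hypothetical syntax: each arrow 'X -> Y' means 'X bought from Y'.
    Assumes the table has 'buyer' and 'seller' columns.
    """
    names = [part.strip() for part in template.split("->")]
    participants = names[:-1]
    if (len(names) < 3 or names[0] != names[-1]
            or len(set(participants)) != len(participants)):
        raise ValueError("expected a closed cycle such as 'A -> B -> C -> A'")
    if not table.isidentifier():
        raise ValueError("table must be a plain identifier")

    hops = len(participants)  # one trade per participant
    aliases = [f"t{i}" for i in range(hops)]
    sql = ["SELECT DISTINCT " + ", ".join(f"{a}.buyer" for a in aliases),
           f"FROM {table} AS t0"]
    # The seller of each trade must be the buyer of the next trade.
    for prev, cur in zip(aliases, aliases[1:]):
        sql.append(f"JOIN {table} AS {cur} ON {cur}.buyer = {prev}.seller")
    # The last seller closes the cycle back to the first buyer.
    conditions = [f"{aliases[-1]}.seller = t0.buyer"]
    # All participants differ; '<' against t0 reports each cycle once,
    # starting from its smallest buyer instead of once per rotation.
    for i in range(hops):
        for j in range(i + 1, hops):
            op = "<" if i == 0 else "<>"
            conditions.append(f"{aliases[i]}.buyer {op} {aliases[j]}.buyer")
    sql.append("WHERE " + " AND ".join(conditions))
    return "\n".join(sql)
```

Running the generated query for "A -> B -> C -> A" with sqlite3 returns one row per three-party cycle. A four- or five-party template simply produces a longer chain of joins.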

Like most solutions, this can be thought of as a slider. At one extreme there is the "full Fraud Detection Language"; at the other, you could just build stored procedures for the most common cases, then write new stored procedures that use those more "basic" blocks to implement the various patterns.

What you are trying to do falls under the Data Mining umbrella, so you could also try to learn more about it. Maybe you can find a Data Mining package for your specific DB (you didn't say which one) and see if it helps you find common patterns in your data.

p.marino
Thanks for your answer. I have some idea about data mining, but as I understand it, it works this way: you have data (records) and a mining algorithm finds the patterns. I am trying to do the opposite: I have the patterns and I want to find the matching records. Can I use data mining in my situation?
Hany
Data Mining is not just "running a Data Mining package and *magically* finding out patterns". Tools in this category (see http://www.datamininglab.com/Portals/0/tool eval articles/smc98_abbott_mat_eld.pdf) let you express which kinds of patterns you find meaningful and then look for them. Try googling "data mining fraud detection" for more about this.
p.marino
A: 

I don't see why you can't use "traditional queries" as you've stated. SQL can be used to write extraordinarily complex queries. For that matter I'm not sure that this is a hugely challenging question.

Firstly, I'd look at the behavior you have described as very transactional, so I'd treat the transactions as a model. I'd likely have a transactions table with columns like buyer, seller, amount, etc.
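To show that a plain query can cover cycles of 3, 4, or 5 participants, here is a minimal sketch using a recursive common table expression, run through Python's sqlite3 module. The schema details are assumptions for the example: buyer and seller are integer account ids, and the names chain, first_buyer, last_seller, and find_cycles are made up here. Other databases support recursive CTEs too, but the syntax and string functions may differ.

```python
import sqlite3

# Each row of 'transactions' is read as: buyer bought from seller.
CYCLE_QUERY = """
WITH RECURSIVE chain(first_buyer, last_seller, path, hops) AS (
    -- Start every chain at its smallest account id, so each cycle
    -- is reported once rather than once per rotation.
    SELECT buyer, seller, '/' || buyer || '/' || seller || '/', 1
    FROM transactions
    WHERE seller > buyer
    UNION ALL
    -- Extend the chain: the last seller must be the buyer of the next trade.
    SELECT chain.first_buyer, t.seller, chain.path || t.seller || '/',
           chain.hops + 1
    FROM chain
    JOIN transactions AS t ON t.buyer = chain.last_seller
    WHERE chain.hops < :max_hops
      AND chain.last_seller <> chain.first_buyer   -- stop once the cycle closed
      AND t.seller >= chain.first_buyer
      AND (t.seller = chain.first_buyer             -- closing the cycle, or
           OR instr(chain.path, '/' || t.seller || '/') = 0)  -- a new account
)
SELECT DISTINCT path
FROM chain
WHERE last_seller = first_buyer AND hops >= :min_hops
ORDER BY path
"""


def find_cycles(conn, min_hops=3, max_hops=5):
    """Return each trading cycle as a list of account ids, e.g. [1, 2, 3, 1]."""
    rows = conn.execute(CYCLE_QUERY, {"min_hops": min_hops, "max_hops": max_hops})
    return [[int(part) for part in path.strip("/").split("/")] for (path,) in rows]
```

The hop limit keeps the recursion bounded, which matters on a large transactions table. In practice you would probably also restrict the rows by security and time window before searching for cycles.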

Alternatively, you could give shares their own table and store, say, the previous 100 owners of each share in that same table using STI (Single Table Inheritance), by putting all the primary keys of the owners into an "owners" column in your shares table, like 234/823/12334/1234/... That way you can run complex queries to see whether a share was owned by the same person more than once, or look for patterns in the string easily and quickly.
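A minimal sketch of checking such an "owners" string for a repeated owner, in Python. The function name repeated_owners is made up for this example, and the slash-separated format is taken from the description above.

```python
from typing import Dict, List


def repeated_owners(owners: str) -> Dict[str, List[int]]:
    """Map each owner id that appears more than once to its positions.

    'owners' is a slash-separated history such as '234/823/12334/1234/'.
    """
    positions: Dict[str, List[int]] = {}
    for index, owner in enumerate(part for part in owners.split("/") if part):
        positions.setdefault(owner, []).append(index)
    return {owner: where for owner, where in positions.items() if len(where) > 1}
```

Splitting on the separator matters: a plain substring search for 234 would also match inside 1234, so comparing whole ids avoids false positives.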

-update-

I wouldn't suggest making up a "small language". I don't see why you'd want to do that when you have a huge selection of wonderful languages and databases to choose from, all of which have well-refined and tested methods for solving exactly what you are doing.

My best advice is to pop open your IDE (thumbs up for TextMate) and pick your favorite language (Ruby in my case). Find some sample data, create your database, and start writing some code! You can't go wrong experimenting like this; it will expose better ways to go about it than we can dream up here on Stack Overflow.

Joseph Silvashy
A "small language" allows non-programmers to express new patterns and possibly try them on your DB without having to code everything from scratch. Your solution is good for "exploring" and may be OK for a little experimenting, but it doesn't seem very amenable to a medium- or long-term solution.
p.marino
One more thing: the OP seems to be working on a "product" (we are trying to design a fraud detection system for stock market) - I doubt that "just fire up the text editor and code away" is the right answer in these cases.
p.marino
Thanks 'p', I don't see how using Ruby and some time tested platforms like MySQL is merely good for exploring. The point with popping open your text editor is to inspire people to get out there and test some of their own solutions with actual data. We can get theoretical about all sorts of systems, but you need to apply it.
Joseph Silvashy
A: 

Definitely Data Mining. But as you point out, you've already got the models (your templates). Look up fraud DETECTION rather than prevention for better search results?

I know some banks use SPSS PASW Modeler for fraud detection. It is very intuitive, and you can see what you are doing as you play around with the data, so you can implement your templates. I agree with Joseph: you need to get playing and make some new data structures.

Maybe a timeseries model?

Eddie