I'm storing papers in SQL Server 2005 and am looking for a way to paste in the text of a paper and then search for potential plagiarism (copied content) in the database.

What's the best way to go about this? Is there a way to get a gauge for the extent to which something is similar to something else using full-text indexing, for several paragraphs of content?

A: 

This is not really the sort of problem that full-text indexing in SQL Server is designed to solve. There's nothing built in to SQL Server that you can really use to help with this.

There are a number of specialised plagiarism detection tools, which a Google search will turn up for you. That's probably your best bet.

David M
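For a rough gauge of overlap outside SQL Server, one common approach is to compare sets of overlapping word sequences ("shingles"). The sketch below is only an illustration of that idea, not how any particular plagiarism tool works. The function names, the shingle length `k`, and the `papers` dictionary (paper id mapped to text already pulled from the database) are all assumptions made for the example.

```python
import re


def shingles(text, k=5):
    """Return the set of k-word shingles (overlapping word n-grams) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole thing as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Symmetric similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def containment(submitted, stored):
    """Fraction of the submitted text's shingles that also occur in a stored paper."""
    if not submitted:
        return 0.0
    return len(submitted & stored) / len(submitted)


def rank_matches(submission, papers, k=5):
    """Score every stored paper against the submission, highest overlap first.

    `papers` is a hypothetical dict of {paper_id: text} loaded from the database.
    """
    sub = shingles(submission, k)
    scores = [
        (paper_id, containment(sub, shingles(text, k)))
        for paper_id, text in papers.items()
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

Containment is used for ranking because a pasted passage is usually much shorter than the paper it was copied from, which drags a symmetric score like Jaccard toward zero. Comparing against every stored paper this way is linear in the number of papers, so it is only practical for a modest collection.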
+1  A: 

Why don't you install Google Desktop and have it index only that one directory?

Then Google Desktop can do the indexing for you.

iterationx
I'm intrigued by your answer -- should I export everything from SQL text fields to a folder?
Caveatrob
If you can export it to a text file, then Google Desktop can parse it. Seems like it would work fine.
iterationx
I actually thought you had a bunch of text files in a folder that you were loading into SQL Server. If so, you wouldn't need to do anything at all: just point Google Desktop at that folder.
iterationx
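The export step discussed in these comments could look roughly like the sketch below, which writes one text file per paper into a folder that a desktop indexer can then watch. It assumes the rows have already been fetched from SQL Server as `(paper_id, body)` pairs. The function name, the `paper_<id>.txt` naming scheme, and the row shape are made up for illustration.

```python
from pathlib import Path


def export_papers(rows, out_dir):
    """Write each (paper_id, body) pair to its own UTF-8 text file.

    `rows` is a hypothetical iterable of pairs already read from the database.
    Returns the list of paths written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for paper_id, body in rows:
        path = out / f"paper_{paper_id}.txt"
        # A NULL text column arrives as None; write an empty file in that case.
        path.write_text(body or "", encoding="utf-8")
        written.append(path)
    return written
```

Keeping the paper id in the file name makes it possible to map a search hit back to its database row.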