Hey,

I have to create a script that compares thousands of large texts to each other, and I'm wondering whether MySQL is the best solution for this. Is there any other free database system I could use for simple but processor-intensive computing?

Please, throw me into your knowledge's pool!

Edit: Nature of the documents: 500-7000 characters each. The comparison checks whether text in one document matches text in another document (plagiarism) and produces statistics, such as the percentage of sentences for which a match was found. I'd also like to be able to set how many characters another string may differ by and still be considered a match.
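To make the requirement concrete, here is a minimal sketch of that kind of comparison, assuming "how many characters may differ" means Levenshtein edit distance and that sentences end at `.`, `!` or `?`. The names `edit_distance`, `split_sentences`, `percent_matched` and the `max_diff` parameter are made up for illustration; they are not part of any database or library.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def split_sentences(text: str) -> list[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip().lower() for p in parts if p.strip()]


def percent_matched(doc: str, other: str, max_diff: int = 3) -> float:
    """Percentage of sentences in `doc` within `max_diff` edits of a sentence in `other`."""
    sentences = split_sentences(doc)
    if not sentences:
        return 0.0
    candidates = split_sentences(other)
    matched = sum(
        1 for s in sentences
        if any(edit_distance(s, c) <= max_diff for c in candidates)
    )
    return 100.0 * matched / len(sentences)
```

For example, `percent_matched("The cat sat. Dogs bark loudly.", "The cat sit. Birds sing.", max_diff=1)` counts the first sentence as a match and the second as not. Comparing every pair of sentences this way is the processor-heavy part, which is why the choice of storage and indexing matters.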

The technology should be server-based. I'm more interested in the DB; I'd then choose an appropriate language to script it with.

More specification: The size of DB must be unlimited.

A: 

SQL Server Express

jdelator
The 4 GB max size is too low. =( Sorry I didn't mention it before.
Skuta
A: 

You don't mention the technology you will be using, the size of the text entries, or the nature of the comparisons. However, I have found the H2 database to be excellent. It is native Java and can be used as an in-memory database, which makes setup trivial.

johnstok
+1  A: 

Besides MySQL, PostgreSQL is a free database engine that scales well and is widely used.

hayalci
A: 

I would recommend MySQL. It has a lot of built-in string handling functions.

BoltBait
+2  A: 

You should consider using Lucene. It lets you store large amounts of text and query them really fast, with good relevance matching too.
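To illustrate why an index of this kind makes text queries fast, here is a toy inverted index in Python. This is not Lucene's API, just a simplified stand-in for the general idea; `TinyIndex`, `tokenize` and the overlap-count scoring are hypothetical names and choices made for the sketch.

```python
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


class TinyIndex:
    """Toy inverted index: maps each term to the set of documents containing it."""

    def __init__(self) -> None:
        self.postings: dict[str, set[int]] = defaultdict(set)

    def add(self, doc_id: int, text: str) -> None:
        for term in set(tokenize(text)):
            self.postings[term].add(doc_id)

    def search(self, query: str) -> list[tuple[int, int]]:
        """Return (doc_id, shared_term_count) pairs, best match first."""
        scores: dict[int, int] = defaultdict(int)
        for term in set(tokenize(query)):
            for doc_id in self.postings.get(term, ()):
                scores[doc_id] += 1
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = TinyIndex()
index.add(1, "the quick brown fox")
index.add(2, "a lazy brown dog")
results = index.search("brown fox")
```

A query only touches documents that share at least one term with it, so candidate documents can be found without scanning every text. A real engine adds much better relevance scoring and on-disk storage on top of this idea.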

Cristian Libardo
A: 

Since you don't specify, why not SQL Server 2008 Express Edition?

It has all of the search features of its big brother, SQL Server 2008. The only problem is that the database can't exceed 4 GB.

balexandre