I have to parse hundreds of text files per second, each containing text on multiple subjects (email bodies, for example). I need to find various patterns in them: keywords, sentences, the most important words and so on. Which programming language is fastest for this?
Note: I have tried PHP and Perl to find keywords using regular expressions. Is there a faster way to do that? And what should I use to get the most important words and analyse the semantics of sentences?
I have a list of keywords stored in a text file (it will probably move to an LDAP directory later). Example: "You've just registered in facebook with these stats: username: user1 password: passwdofuser1"
I have to tag this text as containing words like "password" and "username", and retrieve the username and password values to process later.
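To make the tagging requirement concrete, here is a minimal sketch in Python of what I mean. It is an illustration only, not a claim about the fastest approach. The names `KEYWORDS`, `tag_keywords` and `extract_fields` are hypothetical, and the field pattern assumes values always appear as `keyword: value` with no spaces inside the value.

```python
import re

# Hypothetical keyword list; in practice it would be loaded from the text file.
KEYWORDS = ["username", "password"]

# One compiled alternation, so each text is scanned once for all keywords.
KEYWORD_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")\b",
    re.IGNORECASE,
)

# Assumption: a field looks like "<keyword>: <value>" and the value has no spaces.
FIELD_RE = re.compile(r"\b(username|password):\s*(\S+)", re.IGNORECASE)


def tag_keywords(text):
    """Return the sorted set of keywords that occur in the text."""
    return sorted({m.group(1).lower() for m in KEYWORD_RE.finditer(text)})


def extract_fields(text):
    """Return a dict mapping each field keyword to the value that follows it."""
    return {m.group(1).lower(): m.group(2) for m in FIELD_RE.finditer(text)}


sample = ("You've just registered in facebook with these stats: "
          "username: user1 password: passwdofuser1")
print(tag_keywords(sample))    # ['password', 'username']
print(extract_fields(sample))  # {'username': 'user1', 'password': 'passwdofuser1'}
```

Compiling the patterns once and reusing them across files avoids paying the compilation cost for every file.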
Sentence example: "Let's meet tomorrow at 5pm in st. john's restaurant." I have to extract the important information, such as "tomorrow", "5pm" and "st john's restaurant", to process later.
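For the sentence case, a rough rule-based sketch in Python of the output I am after might look like the following. The patterns are assumptions made up for this one example (a small fixed set of day words, 12-hour times, and places ending in a word like "restaurant"); this is pattern matching, not real semantic analysis, and `extract_details` is a hypothetical name.

```python
import re

# Assumed, deliberately small vocabularies for illustration only.
DAY_RE = re.compile(r"\b(today|tomorrow|tonight|yesterday)\b", re.IGNORECASE)
TIME_RE = re.compile(r"\b(\d{1,2}(?::\d{2})?\s*(?:am|pm))\b", re.IGNORECASE)
# A place is assumed to follow "in"/"at" and end with a known venue word.
PLACE_RE = re.compile(
    r"\b(?:in|at)\s+((?:st\.?\s+)?[a-z']+\s+(?:restaurant|cafe|hotel))",
    re.IGNORECASE,
)


def extract_details(sentence):
    """Return the day, time and place phrases found in the sentence."""
    return {
        "day": DAY_RE.findall(sentence),
        "time": TIME_RE.findall(sentence),
        "place": PLACE_RE.findall(sentence),
    }


sentence = "Let's meet tomorrow at 5pm in st. john's restaurant."
print(extract_details(sentence))
# {'day': ['tomorrow'], 'time': ['5pm'], 'place': ["st. john's restaurant"]}
```

Hand-written rules like these break quickly on free-form text, which is why I am asking what to use for the semantic part.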
Thanks.