natural-language

English Grammar Parsing in PHP (Link Grammar)

Is there any way to use the Link Grammar or AbiSource grammar checker in PHP (or C#, but I'd prefer PHP)? I need a tree structure for English sentences. Any ideas? The only things I found were in C, and I can't use them on a shared host. ...

How to create exclamations for a particular sentence

I would like to create exclamations for a particular sentence using the Java API. e.g. It's surprising == Isn't it surprising! e.g. It's cold == Isn't it cold! Are there any vendors or tools which help you generate exclamations, provided you give a sentence (i.e. the left-hand side in the above example)? Note: The sentences will be p...

Text mining: when to use parser, tagger, NER tool?

I'm doing a project on mining blog contents and I need help deciding which tool to use. When do I use a parser, when do I use a tagger, and when do I need to use a NER tool? For instance, I want to find out the most talked-about topics/subjects across several blogs; do I use a part-of-speech tagger to grab the nouns and do a...

Verbally format a number in Python

How do Pythonistas print a number as words, like the equivalent of the Common Lisp code: [3]> (format t "~r" 1e25) nine septillion, nine hundred and ninety-nine sextillion, nine hundred and ninety-nine quintillion, seven hundred and seventy-eight quadrillion, one hundred and ninety-six trillion, three hundred and eight billion, three hu...
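
One way to approach this with the standard library alone is sketched below. It is a minimal illustration, not an established library API: the names `number_to_words`, `_below_thousand`, `ONES`, `TENS`, and `SCALES` are hypothetical, it handles non-negative integers only, and it imitates the comma and "and" style of the Lisp output quoted above.

```python
# Minimal sketch: spell out a non-negative integer in English words.
# All names here are hypothetical helpers, not part of any library.

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
SCALES = ["", "thousand", "million", "billion", "trillion", "quadrillion",
          "quintillion", "sextillion", "septillion"]


def _below_thousand(n):
    """Spell out 1..999, e.g. 308 -> 'three hundred and eight'."""
    parts = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if rest:
        if rest < 20:
            parts.append(ONES[rest])
        else:
            tens, ones = divmod(rest, 10)
            parts.append(TENS[tens] + ("-" + ONES[ones] if ones else ""))
    return " and ".join(parts)


def number_to_words(n):
    """Spell out a non-negative integer, one comma-separated group per scale."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n == 0:
        return "zero"
    groups = []
    scale = 0
    while n:
        n, chunk = divmod(n, 1000)  # peel off the lowest three digits
        if chunk:
            if scale >= len(SCALES):
                raise ValueError("number too large for this sketch")
            name = _below_thousand(chunk)
            if SCALES[scale]:
                name += " " + SCALES[scale]
            groups.append(name)
        scale += 1
    return ", ".join(reversed(groups))


print(number_to_words(196308))
```

Passing a float such as `1e25` would first need converting with `int()`, which preserves the same floating-point rounding visible in the Lisp output.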

Unstructured Text to Structured Data

I am looking for references (tutorials, books, academic literature) concerning structuring unstructured text in a manner similar to the Google Calendar Quick Add button. I understand this may come under the NLP category, but I am interested only in the process of going from something like "Levi jeans size 32 A0b293" to: Brand: Levi, Si...

Probabilistic Generation of Semantic Networks

I've studied some simple semantic network implementations and basic techniques for parsing natural language. However, I haven't seen many projects that try to bridge the gap between the two. For example, consider the dialog: "the man has a hat" "he has a coat" "what does he have?" => "a hat and coat" A simple semantic network, based...

How to detect language of user entered text?

I am dealing with an application that accepts user input in different languages (currently 3 fixed languages). The requirement is that users can enter text without having to select the language via a provided checkbox in the UI. Is there an existing Java library to detect the language of a text? I want something like this: text ...

Is there something like automatic writing or surrealist automatism in programming?

Is there something like automatic writing or surrealist automatism in programming? ...

Is similarity to "natural language" a convincing selling point for a programming language?

Look, for example, at AppleScript (and there are plenty of others, some admittedly quite good), which advertise their use of the natural language metaphor. Code is apparently more readable because it can be/is intended to be constructed in English-like sentences, they say. I'm sure there are people who would like nothing better than to pr...

Has anyone parsed Wiktionary?

Wiktionary is a wiki dictionary that covers many languages. It even has translations. I'd be interested in parsing it and playing with the data. Has anyone done anything like this before? Is there any library I can use? (Preferably Python) ...

Using ChunkedCorpusReader in nltk

Hello, can someone please post some kind of example of using this to read a file? For example: the fox{{some_tag}} jumped over the((some_other_tag)) lazy dog. The API for it is at: http://nltk.googlecode.com/svn/trunk/doc/api/nltk.corpus.reader.chunked.ChunkedCorpusReader-class.html I can get it to read files and split them over com...

How would you group up articles by context? - Natural Language

Hi folks, I have lists of articles made of: title, subtitle and body. Now I need to parse all these articles and group them under different context categories or subcategories based on their possible keywords. e.g. if the article is likely to be related to sports cars then the article would be associated with the car and/or veh...

Convert 2010-04-15 23:59:59 to 15th Apr 2010

Hi, I have the following date format: 2010-04-15 23:59:59 How would I go about converting this into: 15th Apr 2010 using JavaScript? ...

Where can I get a dump of raw text on the web?

I am looking to do some text analysis in a program I am writing. I am looking for alternate sources of text in its raw form, similar to what is provided in the Wikipedia dumps (download.wikimedia.com). I'd rather not have to go through the trouble of crawling websites, parsing the HTML, extracting text, etc. ...

Plural of words using Open Office API for Python (UNO)

I would like to retrieve the plural forms of words in different languages in Python. I know that OpenOffice has an API called UNO (import uno) and it should give me this ability using OpenOffice's language dictionaries, but I could not find any reference to it. As a concrete example, I would like something like this: >>> print getPluralOf('table') ...

Voice Form Matching in Visual C++

Are there SDKs for voice-form matching / comparison for Visual C++? Or, possibly, for converting sounds to phonetics? Usage: the program will do different things based on certain command words given in a made-up foreign language (Klingon). Analysis - comparison of the user's voice with an existing pre-recorded voice segment. Rather than using...

Can I use NLTK to determine if a comment is a positive one or a negative one?

Can you show me a simple example using http://www.nltk.org/code to determine whether a string expresses a happy or upset mood? ...

How to split a string into words. Ex: "stringintowords" -> "String Into Words" ?

What is the right way to split a string into words? (The string doesn't contain any spaces or punctuation marks.) For example: "stringintowords" -> "String Into Words" Could you please advise what algorithm should be used here? Update: For those who think this question is just for curiosity. This algorithm could be used to camelcase do...
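
A common approach to this kind of problem is dynamic programming over a word list. The sketch below is only an illustration under stated assumptions: `segment` and `VOCAB` are hypothetical names, the tiny vocabulary stands in for a real dictionary, and preferring the split with the fewest words is a simple heuristic (practical systems usually score candidates by word frequency instead).

```python
# Minimal sketch: dictionary-based word segmentation via dynamic programming.
# `segment` and `VOCAB` are hypothetical; a real word list would be much larger.

def segment(text, vocabulary):
    """Return a list of vocabulary words that concatenate to `text`,
    preferring the split with the fewest words, or None if no split exists."""
    max_len = max(map(len, vocabulary), default=0)
    # best[i] is the best known segmentation of text[:i], or None if unreachable.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        # Only look back as far as the longest vocabulary word.
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if best[start] is not None and word in vocabulary:
                candidate = best[start] + [word]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(text)]


VOCAB = {"string", "in", "to", "into", "word", "words"}

words = segment("stringintowords", VOCAB)
print(" ".join(w.capitalize() for w in words))  # String Into Words
```

With this vocabulary both "in" + "to" and "into" are valid continuations after "string"; the fewest-words rule is what selects "into".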

Tools for getting intent from Twitter statuses?

I am considering a project in which a publication's content is augmented by relevant, publicly available tweets from people in the area. But how could I programmatically find the relevant Tweets? I know that generating a structure representing the meaning of natural language is pretty much the holy grail of NLP, but perhaps there's some ...

Break/Decompose complex and compound sentences in NLTK

Is there a way to decompose complex sentences into simple sentences in NLTK or other natural language processing libraries? For example: The park is so wonderful when the sun is setting and a cool breeze is blowing ==> The sun is setting. A cool breeze is blowing. The park is so wonderful. ...