Hello. Suppose I have this chunk of a sentence:
(NP
  (NP (DT A) (JJ single) (NN page))
  (PP (IN in)
    (NP (DT a) (NN wiki) (NN website))))
At a certain moment of time I have a reference to (JJ single) and I want to get the NP node binding A single page. If I get it right, that NP is the parent of the node, A and page a...
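A minimal sketch of the parent lookup described in this question, in plain Python. It assumes the tree is available as a bracketed string; the `Node` class and the `parse` and `find` helpers are hypothetical names written for illustration and are not part of any parser's API.

```python
import re

class Node:
    """A parse-tree node that keeps a link back to its parent."""
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []  # child Node objects
        self.word = None    # set only on pre-terminals such as (JJ single)

    def leaves(self):
        """Return the words covered by this node, left to right."""
        if self.word is not None:
            return [self.word]
        return [w for child in self.children for w in child.leaves()]

def parse(text):
    """Parse a bracketed tree string into linked Node objects."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    root, current = None, None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            # The token after "(" is the node label.
            node = Node(tokens[i + 1], parent=current)
            if current is None:
                root = node
            else:
                current.children.append(node)
            current = node
            i += 2
        elif tok == ")":
            current = current.parent
            i += 1
        else:
            current.word = tok
            i += 1
    return root

def find(node, label, word):
    """Depth-first search for the pre-terminal with this label and word."""
    if node.label == label and node.word == word:
        return node
    for child in node.children:
        hit = find(child, label, word)
        if hit is not None:
            return hit
    return None

tree = parse("(NP (NP (DT A) (JJ single) (NN page)) "
             "(PP (IN in) (NP (DT a) (NN wiki) (NN website))))")
jj = find(tree, "JJ", "single")
noun_phrase = jj.parent  # the NP that directly dominates (JJ single)
print(noun_phrase.label, noun_phrase.leaves())
```

Storing the parent on each node at parse time makes the upward step a single attribute access; the siblings are then simply `noun_phrase.children`.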
Hi,
I need to convert sentences into an RDF format.
In other words "John likes coke" would be automatically represented as
Subject : John
Predicate : Likes
Object : Coke
Does anyone know where I should start? Are there any programs which can do this automatically or would I need to do everything from scratch?
Any help would be appreci...
Hello!
Basically I want to find a path between two NP tokens in the dependencies graph. However, I can't seem to find a good way to do this in the Stanford Parser. Any help?
Thank You Very Much
...
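One way to approach the path question above, sketched in plain Python: treat the dependency edges as an undirected graph and run a breadth-first search. The `shortest_path` helper and the edge list are assumptions made up for illustration; this does not use or reproduce the Stanford Parser API.

```python
from collections import deque

def shortest_path(edges, start, goal):
    """Breadth-first search over dependency edges treated as undirected."""
    neighbours = {}
    for head, dependent in edges:
        neighbours.setdefault(head, []).append(dependent)
        neighbours.setdefault(dependent, []).append(head)

    previous = {start: None}  # token -> token it was reached from
    queue = deque([start])
    while queue:
        token = queue.popleft()
        if token == goal:
            # Walk the back-pointers to rebuild the path.
            path = []
            while token is not None:
                path.append(token)
                token = previous[token]
            return path[::-1]
        for nxt in neighbours.get(token, []):
            if nxt not in previous:
                previous[nxt] = token
                queue.append(nxt)
    return None  # the two tokens are not connected

# Hypothetical (head, dependent) pairs for "John likes cold coke".
# Real code should key tokens by position, since words can repeat.
edges = [("likes", "John"), ("likes", "coke"), ("coke", "cold")]
print(shortest_path(edges, "John", "coke"))
```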
Hello, I'm doing a university project that must gather and combine data on a user-provided topic. The problem I've encountered is that Google search results for many terms are polluted with low-quality autogenerated pages and if I use them, I can end up with wrong facts. How is it possible to estimate the quality/trustworthiness of a pa...
Do you know of any existing implementation in any language (preferably Python) of any entity set expansion algorithms, such as the one from Google Sets? (http://labs.google.com/sets)
I couldn't find any library implementing such algorithms and I'd like to play with some of those to see how they would perform on some specific task I...
Hello,
I have already asked a similar question earlier, but I have noticed that I have a big constraint: I am working on small text sets such as user Tweets to generate tags (keywords).
And it seems like the accepted suggestion (the pointwise mutual information algorithm) is meant to work on bigger documents.
With this constraint (working on ...
I am trying to implement a Naive Bayes approach to find the topic of a given document or stream of words. Is there a Naive Bayes approach that I might be able to look up for this?
Also, I am trying to improve my dictionary as I go along. Initially, I have a bunch of words that map to topics (hard-coded). Depending on the occ...
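For reference, a small sketch of a multinomial Naive Bayes topic classifier with add-one smoothing, in plain Python. The `NaiveBayes` class, its method names, and the training words are hypothetical and chosen only for illustration. Because `train` can be called at any time, the word-to-topic counts can keep growing as new labelled text arrives.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # topic -> word -> count
        self.doc_counts = Counter()              # topic -> number of documents
        self.vocabulary = set()

    def train(self, words, topic):
        """Add one labelled document; can be called incrementally."""
        self.doc_counts[topic] += 1
        self.word_counts[topic].update(words)
        self.vocabulary.update(words)

    def classify(self, words):
        """Return the topic with the highest log-probability."""
        total_docs = sum(self.doc_counts.values())
        best_topic, best_score = None, float("-inf")
        for topic in self.doc_counts:
            # Log prior: share of training documents with this topic.
            score = math.log(self.doc_counts[topic] / total_docs)
            total_words = sum(self.word_counts[topic].values())
            for word in words:
                # Add-one smoothing so unseen words do not zero the score.
                count = self.word_counts[topic][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic

# Made-up seed data standing in for the hard-coded word-to-topic mapping.
model = NaiveBayes()
model.train("goal match team score".split(), "sports")
model.train("election vote party minister".split(), "politics")
print(model.classify("the team won the match".split()))
```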
Hi,
I'm doing a project for a college class I'm taking.
I'm using PHP to build a simple web app that classifies tweets as "positive" (or happy) and "negative" (or sad) based on a set of dictionaries. The algorithm I'm thinking of right now is a Naive Bayes classifier or a decision tree.
However, I can't find any PHP library that helps me do...
Hat in hand here. I'm a seasoned developer and I would be grateful for a bit of help. I don't have time to read or digest long intricate discussions on theoretical concepts around NLP (or go get my PhD). That said, I have read a few and it's a damn interesting field. The problem is I need real-world solutions, for real-world products, in...
Hi folks,
I am working on a feature that applies (grammatical) language segmentation rules to Latin-based languages (currently English).
Currently I am at the stage of splitting user input into sentences.
e.g.:
"I am working in language translation". "I have used Google MT API for this"
In the above example I will break the above sentence ...
Hello,
I have a set of Book objects. The class Book is defined as follows:
class Book {
    String title;
    ArrayList<tags> taglist;
}
Where title is the title of the book, for example: Javascript for dummies.
And taglist is a list of tags, for our example: Javascript, jquery, "web dev", ..
As I said, I have a set of books talking about di...
When I ask a question here, the tool tips for the questions returned by the auto search give the first little bit of each question, but a decent percentage of them don't give any text that is any more useful for understanding the question than the title. Does anyone have an idea about how to make a filter to trim out useless bits of a que...
The problem: Given a set of hand-categorized strings (or a set of ordered vectors of strings), generate a function that categorizes further input. In my case, that data (or most of it) is not natural language.
The question: are there any tools out there that will do that? I'm thinking of some kind of reasonably polished, download, ...
I am embarking upon an NLP project for sentiment analysis.
I have successfully installed NLTK for Python (seems like a great piece of software for this). However, I am having trouble understanding how it can be used to accomplish my task.
Here is my task:
I start with one long piece of data (let's say several hundred tweets on the subje...
Hi,
Is there a way to obtain WordNet adjective nominalizations using NLTK?
For example, for 'happy' the desired output would be 'happiness'.
I tried to dig around, but couldn't find anything.
Thanks!
...
I know this is a long shot, but does anyone know of a dataset of English words that has stress information by syllable? Something as simple as the following would be fantastic:
AARD vark
A ble
a BOUT
ac COUNT
AC id
ad DIC tion
ad VERT ise ment
...
Thanks in advance!
...
I have a large dataset (c. 40G) that I want to use for some NLP (largely embarrassingly parallel) over a couple of computers in the lab, to which I do not have root access, and only 1G of user space.
I experimented with Hadoop, but of course this was dead in the water: the data is stored on an external USB hard drive, and I can't load it...
I'm working on a project at the moment where it would be really useful to be able to detect when a certain topic/idea is mentioned in a body of text. For instance, if the text contained:
Maybe if you tell me a little more about who Mr Jones is, that would help. It would also be useful if I could have a description of his appearance, ...
I have over 1000 surveys, many of which contain open-ended replies.
I would like to be able to 'parse' in all the words and get a ranking of the most used words (disregarding common words) to spot a trend.
How can I do this? Is there a program I can use?
EDIT: If a 3rd party solution is not available, it would be great if we can keep...
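The word ranking asked about here can be sketched with Python's standard library alone. The stopword set, the `rank_words` helper, and the sample replies below are assumptions for illustration; a real stopword list would be much longer.

```python
import re
from collections import Counter

# A deliberately tiny, made-up list of common words to ignore.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "is", "was",
             "it", "i", "in", "very"}

def rank_words(replies, top_n=10):
    """Count non-stopword tokens across all replies, most frequent first."""
    counts = Counter()
    for reply in replies:
        words = re.findall(r"[a-z']+", reply.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

# Hypothetical open-ended survey replies.
replies = [
    "The staff was friendly",
    "Friendly staff and fast service",
    "Service was slow",
]
for word, count in rank_words(replies, top_n=5):
    print(word, count)
```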
I have a data set with multiple layers of annotation over the underlying text, such as part-of-speech tags, chunks from a shallow parser, named entities, and others from various natural language processing (NLP) tools. For a sentence like The man went to the store, the annotations might look like:
Word POS Chunk NER
==== === ===== ...