For a linguistics course we implemented part-of-speech (POS) tagging using a hidden Markov model (HMM), where the hidden variables were the parts of speech. We trained the system on some tagged data, then tested it and compared our results with the gold data.
Would it have been possible to train the HMM without the tagged training set?
...
I'm trying to convert a number to words, but I have a problem:
>> (91.80).en.numwords
=> "ninety-one point eight"
I want it to be "ninety-one point eighty". I use the Linguistics gem. Do you know of a solution for this (preferably with Linguistics)?
...
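One possible approach, sketched here in Python rather than with the Linguistics gem (none of the names below come from that gem): format the value with a fixed number of decimals so the trailing zero survives, then spell out the fractional part as a whole number. The helper names `words_under_100` and `numwords_two_decimals` are hypothetical, and the sketch assumes non-negative values below 100 with exactly two decimal places.

```python
# Word tables for 0-19 and the multiples of ten.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def words_under_100(n):
    """Spell out an integer in the range 0..99 (hypothetical helper)."""
    if not 0 <= n < 100:
        raise ValueError("this sketch only handles 0..99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def numwords_two_decimals(value):
    """Read the two decimal digits as one number, e.g. .80 -> 'eighty'."""
    # Fixed-width formatting keeps the trailing zero that 91.80 would lose.
    whole, frac = f"{value:.2f}".split(".")
    return f"{words_under_100(int(whole))} point {words_under_100(int(frac))}"


print(numwords_two_decimals(91.80))
```

Note that with this reading a value such as 91.05 comes out the same as 91.5 would if the leading zero is not handled separately, so a real solution would need to decide how to treat it.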
Here's an algorithm for adding an apostrophe to a given input noun.
How would you construct a string to show ownership?
/**
* apostrophizes the string properly
* <pre>
* curtis = curtis'
* shaun = shaun's
* </pre>
*
* @param input string to apostrophize
* @return apostrophized string or empty string if the input was empty or nul...
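The rule described in the comment above can be sketched as follows. This is a minimal Python illustration, not the poster's implementation: it assumes the only rule is "names ending in s take a bare apostrophe, everything else takes 's", and it returns an empty string for empty or missing input as the comment states. The function name `apostrophize` is chosen for illustration.

```python
def apostrophize(noun):
    """Return the possessive form of `noun`, or "" for empty/None input."""
    if not noun:
        return ""
    # Assumed rule: a trailing "s" takes only an apostrophe (curtis -> curtis').
    if noun.lower().endswith("s"):
        return noun + "'"
    # Everything else takes apostrophe + s (shaun -> shaun's).
    return noun + "'s"


print(apostrophize("curtis"))
print(apostrophize("shaun"))
```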
Hi,
I'm designing the architecture of a text parser. Example sentence: Content here, content here.
The whole sentence is a... sentence, that's obvious. The, quick etc. are words; , and . are punctuation marks. But what are words and punctuation marks all together in general? Are they just symbols? I simply don't know how to name what a singl...
Hello, I would like to know how to implement a solution to the following task:
There's a 500 MB file of plain English text.
I'd like to collect the statistics about the frequency of words,
but additionally to be sure that each word is recognized correctly (or the majority of words).
In terms that 'cry' in the sentence "she gave a loud CRY" ...
What are some books on how to build a natural language parsing program like this:
input: I got to TALL you
output: I got to TELL you
input: Big RAT box
output: Big RED box
in: hoo un thum zend three
out: one thousand three
It must have a language model that allows it to predict which words are misspelled!
What are the best books on ...
Let's say you should monitor the brand "ONE" online. What algorithms can be used to separate pages about the brand ONE from pages containing the common word ONE?
I'm thinking maybe Bayes could work, but are there other ways to do this?
...
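To make the Bayes idea concrete, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing, written in plain Python. The class name, the labels `brand` and `common`, and the four training snippets are all made up for illustration; a real system would need many labeled pages and better features (the sketch lowercases everything, which throws away the capitalization of "ONE").

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.doc_counts:
            # Log prior plus sum of log likelihoods with add-one smoothing.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, invented for this example.
nb = NaiveBayes()
nb.train("ONE launches new phone", "brand")
nb.train("buy the ONE phone today", "brand")
nb.train("one of the best days", "common")
nb.train("there is only one way", "common")

print(nb.classify("new ONE phone"))
print(nb.classify("only one way of"))
```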
I'm making a boggle-like word game. The user is given a grid of letters like this:
O V Z W X
S T A C K
Y R F L Q
The user picks out a word using any adjacent chains of letters, like the word "STACK" across the middle line. The letters used are then replaced by the machine e.g. (new letters in lowercase):
O V Z W X
z e x o p
Y R F L Q...
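Checking whether a word can be picked out of such a grid can be sketched as a depth-first search. This is only an illustration in Python under two assumptions that the description does not confirm: "adjacent" includes diagonals, and a cell may be used at most once per word. The function name `contains_word` is hypothetical.

```python
def contains_word(grid, word):
    """Return True if `word` can be traced through adjacent cells of `grid`."""
    if not word or not grid:
        return False
    rows, cols = len(grid), len(grid[0])

    def search(r, c, i, used):
        # The current cell must match the i-th letter of the word.
        if grid[r][c] != word[i]:
            return False
        if i == len(word) - 1:
            return True
        used.add((r, c))
        # Try all eight neighbours (assumption: diagonals count as adjacent).
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if ((dr or dc) and 0 <= nr < rows and 0 <= nc < cols
                        and (nr, nc) not in used
                        and search(nr, nc, i + 1, used)):
                    return True
        used.discard((r, c))  # backtrack so other paths can reuse this cell
        return False

    return any(search(r, c, 0, set())
               for r in range(rows) for c in range(cols))


grid = ["OVZWX", "STACK", "YRFLQ"]
print(contains_word(grid, "STACK"))
```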
Hi All,
Thanks for stopping to read my question :) This is a very sweet place full of GREAT people!
I have a question about "creating sentences with words". No, no, it is not about English grammar :)
Let me explain. If I have a bag of words like
"person apple apple person person a eat person will apple eat hungry apple hungry"
and it can...
Hi,
I'm not sure what's the best algorithm to use for classifying relationships between words. For example, in a sentence such as "The yellow sun" there is a relationship between yellow and sun. The machine learning techniques I have considered so far are Bayesian statistics, rough sets, fuzzy logic, hidden Markov models ...
I am looking for word alignment tools and algorithms.
I am dealing with bilingual English-Hindi text, and am currently working on
DTW (Dynamic Time Warping) algorithm
CLA (Competitive Linking Algorithm)
NATools
Giza++
Could you please suggest any other algorithm/tool which is language independent and which could achieve Statistical w...
Hello. Say I have a base form of a word and a tag from the Penn Treebank Tag Set. How can I get the conjugated form? For example for "do" and "VBN" how can I get "done"?
I think this task is already implemented in some NLP library, so I'd rather not reinvent the wheel. Does something like that exist?
...
I was wondering if anyone was familiar with any attempts at algorithmic sentence negation.
For example, given a sentence like "This book is good" provide any number of alternative sentences meaning the opposite like "This book is not good" or even "This book is bad".
Obviously, accomplishing this with a high degree of accuracy would pr...
I'm tasked with searching for the use of cliches and common phrases in text. The phrases are similar to the phrases you might see for the phrase puzzles on Wheel of Fortune. Here are a few examples:
Easy Come Easy Go
Too Good To be True
Winning Isn't Everything
I cannot find a list of phrases, however. Does anybody know of such a list...
Hello. Let's say there is a sentence:
On March 1, he was born.
Changing it to
He was born on March 1.
doesn't break the sense of the sentence and it is still valid. Shuffling words in any other way would produce weird to invalid sentences. So basically, I'm talking about parts of the sentence, which make the information more speci...
Hi,
I need to convert sentences into an RDF format.
In other words "John likes coke" would be automatically represented as
Subject : John
Predicate : Likes
Object : Coke
Does anyone know where I should start? Are there any programs which can do this automatically or would I need to do everything from scratch?
Any help would be appreci...
I'd like to learn the foundations of encodings, characters and text. Understanding these is important for dealing with a large set of text, whether that is log files or source text for building algorithms for collective intelligence. My current knowledge is pretty basic: something like "As long as I use UTF-8, I'm okay."
I don't say I need ...
Hi,
I want to know how to get the correct word from a misspelled one.
Example:
The string is "sstring"
but the correct word is "string".
Is there an algorithm for this in PHP?
Thanks in advance.
...
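One common approach is to compare the misspelled word against a list of known words and pick the closest match. The sketch below uses Python's standard `difflib.get_close_matches` purely as an illustration; the word list `KNOWN_WORDS` and the function name `correct` are made up for this example, and the cutoff of 0.6 is simply the library default. In PHP, the built-in `levenshtein()` and `similar_text()` functions could play a similar role.

```python
import difflib

# Hypothetical dictionary; a real one would hold thousands of words.
KNOWN_WORDS = ["string", "strong", "spring"]


def correct(word, known_words=KNOWN_WORDS, cutoff=0.6):
    """Return the closest known word, or the input unchanged if none is close."""
    matches = difflib.get_close_matches(word, known_words, n=1, cutoff=cutoff)
    return matches[0] if matches else word


print(correct("sstring"))
```

This only measures character similarity, so it cannot tell which of two equally close words fits the context.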
Some languages, particularly Slavic languages, change the endings of people's names according to the grammatical context. (For those of you who know grammar or studied languages that do this to words, such as German or Russian, and to help with search keywords, I'm talking about noun declension.)
This is probably easiest with a set of e...
An interlinear gloss can be used to lay out a translation of a document.
http://en.wikipedia.org/wiki/Interlinear_gloss
Usually this is done word-by-word or morpheme-by-morpheme. However, I would like to do this in a different way, translating entire paragraphs at a time. The following link and image are an example of what I want done,...