words

Best technical shibboleths and keywords

In natural language research and anthropology there is the construction known as a Shibboleth. Specifically this is defined as when your pronunciation of a word gives away your cultural background. This isn't just your favourite piece of hi-tech argot, but that favourite thing that lusers will say wrong and you'll know them for the n00...

How can I split multiple joined words?

I have an array of 1000 or so entries, with examples below: wickedweather liquidweather driveourtrucks gocompact slimprojector I would like to be able to split these into their respective words, as: wicked weather liquid weather drive our trucks go compact slim projector I was hoping a regular expression might do the trick. But, sinc...
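
One possible approach to the splitting question above, sketched in Python: a regular expression alone cannot know where one word ends, so this looks words up in a vocabulary and backtracks when a prefix leads nowhere. `VOCAB` and `split_words` are hypothetical names, and the tiny word set is an assumption for illustration; a real run would need a full word list.

```python
from functools import lru_cache

# Hypothetical vocabulary for illustration only; a real word list would be far larger.
VOCAB = {"wicked", "weather", "liquid", "drive", "our", "trucks",
         "go", "compact", "slim", "projector"}

def split_words(text, vocab=VOCAB):
    """Return a list of vocabulary words that concatenate to `text`, or None."""

    @lru_cache(maxsize=None)
    def solve(start):
        # Reaching the end of the string means everything before it was split.
        if start == len(text):
            return ()
        # Try the longest candidate word first, then shorter ones.
        for end in range(len(text), start, -1):
            word = text[start:end]
            if word in vocab:
                rest = solve(end)
                if rest is not None:
                    return (word,) + rest
        return None  # no split of text[start:] exists with this vocabulary

    result = solve(0)
    return list(result) if result is not None else None

print(split_words("driveourtrucks"))
```

Trying longer words first is only a heuristic. With a large vocabulary several splits may be valid, and picking the best one would need something like word frequencies, which this sketch does not attempt.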

Counting the number of occurrences of words in a textfile

How could I go about keeping track of the number of times a word appears in a textfile? I would like to do this for every word. For example, if the input is something like: "the man said hi to the boy." Each of "man said hi to boy" would have an occurrence of 1. "the" would have an occurrence of 2. I was thinking of keeping a diction...
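
A minimal Python sketch of the counting idea in the question above, using the standard library's `collections.Counter` as the dictionary. The function name `count_words` and the rule that a word is a run of letters or apostrophes are assumptions; the question does not say how punctuation or case should be handled.

```python
import re
from collections import Counter

def count_words(text):
    """Count case-insensitive occurrences of each word in `text`."""
    # Assumption: a word is a run of ASCII letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

counts = count_words("the man said hi to the boy.")
for word, n in counts.most_common():
    print(word, n)
```

For a file, the same function could be fed the result of reading the file as text; `most_common(10)` would then give the ten most frequent words.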

Natural English language words

I need the most exhaustive English word list I can find for several types of language processing operations, but I could not find anything on the internet that has good enough quality. There are 1,000,000 words in the English language including foreign and/or technical words. Can you please suggest such a source (or close to 500k w...

Counting lines, words, characters and top ten words?

Hi, I'm pretty new to Stack Overflow so I hope that I'm doing this correctly and that someone out there has the answer I need. I'm currently coding a program in Java with the Eclipse IDE and my question is this: I need a snippet of code that does the following. It's supposed to get a .TXT file containing text and from that .TXT file count t...

Putting spaces back into a string of text with unreliable space information

I need to parse some text from PDFs, but the PDF formatting results in extremely unreliable spacing. The result is that I have to ignore the spaces and have a continuous stream of non-space characters. Any suggestions on how to parse the string and put spaces back into the string by guessing? I'm using Ruby. Or should I say I'musingrub...

Detecting misspelled words

I have a list of airport names, and my users can enter an airport name to select it for further processing. How would you handle misspelled names and present a list of suggestions? ...
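
One way to sketch the suggestion step in Python is with `difflib.get_close_matches` from the standard library, which ranks candidates by string similarity. The airport list, the `suggest` helper, and the 0.6 cutoff are all assumptions made for illustration, not part of the question.

```python
import difflib

# Hypothetical airport names; the real list would come from the application.
AIRPORTS = ["Heathrow", "Gatwick", "Schiphol", "Charles de Gaulle"]

def suggest(name, candidates=AIRPORTS, limit=3):
    """Return up to `limit` candidates that look similar to `name`."""
    # Compare in lower case so that capitalisation does not affect the score.
    by_lower = {c.lower(): c for c in candidates}
    close = difflib.get_close_matches(name.lower(), list(by_lower),
                                      n=limit, cutoff=0.6)
    return [by_lower[c] for c in close]

print(suggest("Heathrw"))
```

Lowering the cutoff returns more, looser suggestions. For long lists a dedicated index or an edit-distance search may be a better fit than comparing against every name.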

How do I do word Stemming or Lemmatization?

I've tried PorterStemmer and Snowball but both don't work on all words, missing some very common ones. My test words are: "cats running ran cactus cactuses community communities", and both get less than half right. Ideally the class/function would be in PHP, but I can port it if it's in another language. See also: Stemming algorith...

Is there a ReadWord() method in the .NET Framework?

I'd hate to reinvent something that was already written, so I'm wondering if there is a ReadWord() function somewhere in the .NET Framework that extracts words based on some text delimited by white space and line breaks. If not, do you have an implementation that you'd like to share? string data = "Four score and seven years ago"; List<st...

Number to words in Rave Report

I have managed to design a report the way I want, but I am not able to get Rave Report to print/convert the grand total to words. For example, if the grand total is 1,200.00 it should print One Thousand and Two Hundred only. Is something like this possible in Rave Report? ...

Python strings split with multiple separators

Weird - I think what I want to do is a fairly common task but I've found no reference on the web. I have text, with punctuation, and I want an array of the words, i.e. "Hey, you - what are you doing here!?" should be ['hey', 'you', 'what', 'are', 'you', 'doing', 'here']. But Python's split() only works with one argument... so I have all...
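
A short sketch of one way to get the list described in the question above: instead of splitting on every separator, match the words themselves with `re.findall`. The helper name `words_in` is hypothetical, and treating `\w+` (letters, digits and underscore) as a word is an assumption.

```python
import re

def words_in(text):
    """Return the lower-cased words in `text`, ignoring punctuation."""
    # \w+ matches runs of letters, digits and underscores.
    return re.findall(r"\w+", text.lower())

print(words_in("Hey, you - what are you doing here!?"))
# ['hey', 'you', 'what', 'are', 'you', 'doing', 'here']
```

Note that `\w+` breaks contractions such as "don't" into two tokens; a pattern like `[\w']+` would keep them together if that matters.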

extract words from a file

I'm trying to create a dictionary of words from a collection of files. Is there a simple way to print all the words in a file, one per line? ...

PHP: count uppercase words in string

Hi, is there an easy way to count uppercase words within a string? ...
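
The question asks about PHP; the idea is sketched here in Python for illustration only. It assumes that "uppercase word" means a word written entirely in capital ASCII letters, which the question does not actually specify, and `count_uppercase_words` is a hypothetical name.

```python
import re

def count_uppercase_words(text):
    """Count words that consist only of upper-case ASCII letters."""
    # Assumption: a word is a run of ASCII letters; digits and punctuation split words.
    words = re.findall(r"[A-Za-z]+", text)
    return sum(1 for w in words if w.isupper())

print(count_uppercase_words("Send the PDF to NASA today"))  # 2
```

If capitalised words such as "Send" should count as well, the test would be `w[0].isupper()` instead.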

jquery reserved words

Is there a list of jQuery reserved words published somewhere? I ask because jQuery won't return a value for a class I'm using called "selected". If I change the class name to something else it is found. Example: <ul> <li><a id="a1" class="selected" href="#tab1">Part I</a></li> </ul> alert($('ul li a').attr("class")); I get an em...

PHP Limit string output by specific characters

Hello, I am trying to limit the number of characters returned from a string using PHP. I've applied a solution that just seemed to crash the server (high load / infinite loop), so I am asking for an alternative. Simply, I am trying to find a solution that cuts the string, displays a specific number of characters, but still respects the meanin...

AppleScript Word Count Service

Hi folks, I am trying to create a service in OS X Leopard that counts the number of words of selected text. I have Automator set to run an AppleScript, with the following put in it: on run {input, parameters} count words of input display alert "Words: " & input return input end run When I compile the script, it...

How do I search for a 'blank tile' in a scrabble application? (PHP)

I created this application a couple of months ago: http://www.mondofacto.com/word-tools/scrabble-solver.html The application lets the user enter the set of letters they are given, and echoes back what valid words they can use, along with what score they will get for using those letters. Basically, what I want to do is extend the applica...

Where to get a list of almost all the words in English language?

I want to get some random text generated. I tried writing a basic Java programme, int nowords = r.nextInt(2000); int i, j; for (i = 0; i < nowords; i++) { int lengthofword = r.nextInt(10) + 2; for (j = 0; j < lengthofword; j++) { int ch = r.nextInt(26); System.out...

PHP: word definition script?

I am developing a web page that accepts input words from the user. When the user submits those words, I want to display a definition of each word, or a Wikipedia link for more information about it. Something like below: Let's say the user entered 5 words: toast, egg, beans, coffee, tea Now I want to displ...

CKeditor character/word count

I need to display a character count and word count with the new CKeditor. I tried to search for a plugin but there are none, except a few hacks for the old FCKeditor. ...