text

Libraries or tools for generating random but realistic text

I'm looking for tools for generating random but realistic text. I've implemented a Markov Chain text generator myself and while the results were promising, my attempts at improving them haven't yielded any great successes. I'd be happy with tools that consume a corpus or that operate based on a context-sensitive or context-free grammar...
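
For reference, a minimal sketch of the kind of word-level Markov chain generator the question describes, not the asker's actual implementation. The function names (build_chain, generate), the order-2 state, and the toy corpus are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_chain(words, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=20, seed=None):
    """Walk the chain from a random starting state, emitting up to `length` words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a fixed seed is reproducible
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:  # dead end: this state was never followed by anything
            break
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)


# Hypothetical toy corpus; a real corpus would be far larger.
corpus = "the cat sat on the mat and the cat ate the fish on the mat".split()
chain = build_chain(corpus, order=2)
print(generate(chain, length=12, seed=1))
```

Raising the order tends to make the output more coherent but closer to verbatim copies of the corpus; lowering it does the opposite.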

python and pyPdf - how to extract text from the pages so that there are spaces between lines

Currently, if I make a page object of a PDF page with pyPdf and call extractText(), the lines are concatenated together. For example, if line 1 of the page says "hello" and line 2 says "world", the text returned from extractText() is "helloworld" instead of "hello world". Does anyone know how to fix this, or have su...

Options on display of text at 45 degree angle in browser

I have a requirement to display text at a 45 degree angle in the browser. The text is selected via ajax calls from a selection of a lot of values. Text varies in length up to 100 characters. Need to display the first part and the last few characters (I can figure this part out). For example: "This is the text and it can be quite long......

Import txt to sql database via linq

I need to import a large tab-delimited text file with a lot of columns (over 50). I would like to write a C# script that creates the table based on the header of the text file. Assume all fields are nvarchar(1000). I cannot use any program such as the SQL data import wizard. ...

itext multiline text in bounding box

Hi. Does anyone know how to add multiline text in a bounding box (with coordinates specified) in iText? I tried cb.showTextAligned(PdfContentByte.ALIGN_LEFT, text, bounds.getLeft(), TOTAL_HEIGHT-bounds.getTop(), 0); but it does not support newlines. I also tried PdfContentByte cb = writer.getDirectContent(); cb.moveText(300,40...

PDF to String in Java

What is the easiest way to get the text (words) of a PDF doc as one long String or an array of Strings? I have tried pdfbox, but that is not working for me. ...

Multiple text areas with different rotation values causes borders to be very very wrong

If you have two textareas, one with a rotation value other than 0 and the other with no rotation value or a value of 0, and you 'tab' focus from the one w/rotation to the one w/out, the border around the textArea w/out rotation will be rotated. If you set the rotation value of the non-rotated text field to a non-zero number, even 0.01, it fi...

AJAX textarea blocked for writing when processing request

I have a textarea where people can write comments and click a button to post them. The processing is done with AJAX and so I want that as long as the server is processing the user request (and after too), the button and textarea will be blocked for editing/writing/clicking. It's very similar to how comments on Youtube videos work. Do y...

/ in vi Search and replace?

in vi, search and replace, how do you escape a '/' (forward slash) so that it is correct. Say in a path. like: /Users/tom/documents/pdfs/ :%s//Users/tom/documents/pdfs//<new text>/g --FAILS (obviously) :%s/\/Users/tom/documents/pdfs\//<new text>/g -- FAILS with a trailing error :%s/'/Users/tom/documents/pdfs/'/<new text>/g -- FAILS ...

Text processing / comparison engine

Hi, I'm looking to compare two documents to determine what percentage of their text matches based on keywords. To do this I could easily chop them into a set of sanitised words and compare, but I would like something a bit smarter, something that can match words based on their root, i.e. even if their tense or plurality is differen...

rich text editor in c sharp

Hi. I want to use a rich text editor like TinyMCE in my Windows application. Is there any solution for this? ...

Numbering in jQuery

How could I change the text below so that the text within it has a number appended to it? <div class="right">This is some text</div> <div class="right">This is some text</div> <div class="right">This is some text</div> So the code above would become, This is some text This is some text This is some text ...

How do I wrap long lines of text in a Java TextBox?

Hello! I want to load a text box in Java from a text file. This sounds simple, but the big question is how to add newlines when the text gets close to the edge of the box, for example. | | | Java java java Java java java |Java java java...

Tools that automate merging of text files?

I have trouble where, for some reason, SVN would only merge the newly generated template code to implemented code (thus overwriting whatever I had done), but not the other way around. For example, 1) I generate a file called SomeFile.java. I commit this to trunk. I also branch this to feat1/SomeFile.java 2) I work off of the feat1/Som...

MySQL check how many times a word appears inside a table cell

I want to write a query to select from a table all rows with the word "piggy" in a column called Description. SELECT * FROM table WHERE ...? Thank you! ...
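
One common way to approach this, sketched under assumptions: filter with LIKE, then count occurrences by comparing the string length before and after removing the word. The sketch runs against an in-memory SQLite database purely as a stand-in so that it is self-contained; the table name items and the sample rows are hypothetical. In MySQL itself, `||` is not string concatenation by default (CONCAT() is the usual choice), and LENGTH() counts bytes while CHAR_LENGTH() counts characters, so the query would need adjusting.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, Description TEXT)")
conn.executemany(
    "INSERT INTO items (Description) VALUES (?)",
    [("piggy bank",), ("a piggy met a piggy",), ("no match here",)],
)

word = "piggy"
rows = conn.execute(
    """
    SELECT id,
           Description,
           -- characters removed by REPLACE, divided by the word length,
           -- gives the number of occurrences in the cell
           (LENGTH(Description) - LENGTH(REPLACE(Description, ?, ''))) / LENGTH(?) AS hits
    FROM items
    WHERE Description LIKE '%' || ? || '%'
    ORDER BY id
    """,
    (word, word, word),
).fetchall()

for row in rows:
    print(row)
```

Note that this counts substring matches, so a value such as "piggyback" would also be counted; matching whole words only would need a different approach.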

MacVim set as default text editor: How to set files to open in a new tab as opposed to a new window?

I've set MacVim as my default text editor, and when I double click files it opens up a new window. Is there a way to set it to open up in a new tab instead? ...

How to make words into a category. (NLP)

I love to eat chicken. Today I went running, swimming and played basketball. My objective is to return FOOD and SPORTS just by analyzing these two sentences. How can you do that? I am familiar with NLP and Wordnet. But is there something more high-level/practical/modern technology?? Is there anything that automatically categorizes w...

Does WordNet have "levels"? (NLP)

For example... Chicken is an animal. Burrito is a food. WordNet allows you to do "is-a"...the hierarchy feature. However, how do I know when to stop travelling up the tree? I want a LEVEL that is consistent. For example, if presented with a bunch of words, I want WordNet to categorize all of them, but at a certain level, so it doesn'...

sed stream editor unix linux command : how to retain a paragraph with a particular text string

Hi I have managed to put text in a file by separating them by blank lines. I am trying to keep only those paragraphs that have a particular string. Though the Sed FAQ mentions a solution it does not work (see examples below) http://www.catonmat.net/blog/sed-one-liners-explained-part-two/ 58. Print a paragraph that contains “AAA”. (P...

Text Pattern Processing in paragraph with unix linux utilities

I have a file with the following pattern (please note this is a file generated using sed, awk, grep etc processing). The part of file input is as follows. filename1, BASE=a/b/c CONFIG=$BASE/d propertiesfile1=$CONFIG/e.properties EndOfFilefilename1 filename2, BASE=f/g/h CONFIG=$BASE/i propertiesfile1=$CONFIG/j.properties EndOfFilef...