So I have a huge collection of PDF files that I need to extract text from.
The files are encrypted, but I know the password for them. I'm looking for a way to automate the process of extracting the text.
I can manually open the file in Acrobat Professional, remove security by typing in the password, and then save as a .txt file. But there'...
All the PHP files in my workspace are encoded in Unicode (UTF-8, no BOM). I often duplicate an existing source file to use as a base for a new script. Invariably (with Path Finder or the original Finder), OS X will convert the encoding of the duplicate file to Western (Mac OS Roman).
Is there any way to make OS X behave and not convert ...
Some time ago, I wrote a small script using Text::DeDupe to remove duplicate blog posts before I have to lay my eyes on them.
After reading the Syntactic Clustering of the Web paper, on which the implementation is based, I would love to have the ability to find overlapping documents (e.g. snippets of blogs as opposed to full text, maybe also quot...
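One way to picture the overlap idea from that paper is with word shingles: resemblance compares two documents symmetrically, while containment asks how much of one document appears inside the other, which is the case of a snippet versus a full post. The following is only a minimal sketch under assumptions: the function names and the shingle width `w=4` are made up for illustration, it is not how Text::DeDupe is implemented, and it skips the sampling step that makes the approach scale.

```python
def shingles(text, w=4):
    """Return the set of w-word shingles (contiguous word windows) in text."""
    words = text.lower().split()
    if len(words) < w:
        # Short texts yield a single shingle holding all of their words.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + w]) for i in range(len(words) - w + 1)}


def resemblance(a, b, w=4):
    """Jaccard similarity of the two shingle sets: |A & B| / |A | B|."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def containment(a, b, w=4):
    """Fraction of a's shingles that also occur in b: |A & B| / |A|."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa:
        return 0.0
    return len(sa & sb) / len(sa)


# Hypothetical sample data: a full post and a snippet quoted from it.
full_post = "the quick brown fox jumps over the lazy dog near the river bank today"
snippet = "brown fox jumps over the lazy dog"

print(containment(snippet, full_post))  # 1.0: every snippet shingle is in the post
print(resemblance(snippet, full_post))  # low: 4 shared shingles out of 11 distinct
```

A high containment with a low resemblance is the signature of "one document is an excerpt of the other", which plain duplicate detection would miss.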
Here's what I am trying to do:
Select text from a webpage I pulled up using my web browser control.
After clicking a button while this text is still selected, I would like a message box
to pop up displaying the text that was highlighted by the user.
How do I get this functionality to work in my WPF application?
I think I'm on the right tra...
I'm writing a utility to export Evernote notes into Outlook on a schedule. The Outlook APIs need plain text, and Evernote outputs an XHTML version of the plain-text note. What I need is to strip out all the tags and unescape the source XHTML doc embedded in the Evernote export file.
Basically I need to turn:
<note>
<title>Test Sy...
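The two steps named above (unescape the embedded document, then strip the tags) can be sketched with Python's standard library. This is an illustration only: the sample input string is invented because the real export sample is cut off, `TextExtractor` and `xhtml_to_text` are hypothetical names, and turning `<br/>` into a newline is an assumption about what "plain text" should look like.

```python
from html import unescape
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the text content of a document, dropping every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Assumption: a line break in the note should survive as a newline.
        if tag == "br":
            self.chunks.append("\n")

    def handle_data(self, data):
        self.chunks.append(data)


def xhtml_to_text(xhtml):
    """Strip tags from an XHTML string and decode its entities."""
    parser = TextExtractor()
    parser.feed(xhtml)
    parser.close()
    return "".join(parser.chunks).strip()


# Invented example of an escaped XHTML fragment embedded in an export file.
embedded = "&lt;div&gt;Fish &amp;amp; chips&lt;br/&gt;second line&lt;/div&gt;"

# Step 1: unescape the embedded document. Step 2: strip its tags.
plain = xhtml_to_text(unescape(embedded))
print(plain)
```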
We're doing a lot of large, but straightforward forms for a fairly big project (about 600 users using it throughout the day - that's big for me at least ;-) ).
The forms have a lot of question/answer type sections, so it's natural for some people to type a sentence, while others type a novel. How beneficial would it be to put a charact...
Given any arbitrary, one-line string, my goal is to render it into a bitmap representation. However, I have no means of finding out its dimensions beforehand, so I am reduced to getting the glyph range's bounding rect and resizing my canvas if it's not large enough. Unfortunately, if the canvas is not wide enough for the string, but ta...
Hiya,
I have an XML file, and I want to find nodes that have duplicate CDATA. Are there any tools that exist that can help me do this?
I'd be fine with a tool that does this generally for text documents.
...
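If no ready-made tool turns up, a short script may be enough. The sketch below assumes the file fits in memory and uses Python's `xml.etree.ElementTree`, which merges CDATA sections into ordinary element text, so it reports any repeated text value rather than CDATA specifically. The function name and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def find_duplicate_text(xml_source):
    """Map each repeated text value to the tags of the elements that carry it."""
    root = ET.fromstring(xml_source)
    seen = defaultdict(list)
    for elem in root.iter():
        # CDATA content arrives here as plain element text.
        text = (elem.text or "").strip()
        if text:
            seen[text].append(elem.tag)
    return {text: tags for text, tags in seen.items() if len(tags) > 1}


# Hypothetical sample: two <item> nodes share the same CDATA payload.
sample = """<items>
  <item><![CDATA[hello <world>]]></item>
  <note><![CDATA[unique]]></note>
  <item><![CDATA[hello <world>]]></item>
</items>"""

duplicates = find_duplicate_text(sample)
print(duplicates)
```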
I'm looking for a good JavaScript RegEx to convert names to proper cases. For example:
John SMITH = John Smith
Mary O'SMITH = Mary O'Smith
E.t MCHYPHEN-SMITH = E.T McHyphen-Smith
John Middlename SMITH = John Middlename Smith
Well you get the idea.
Anyone come up with a comprehensive solution?
...
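The examples above suggest three rules: capitalize each piece of a name separately after a hyphen, apostrophe, or period, lower-case the rest, and treat a leading "Mc" specially. Here is a rough sketch of those rules in Python rather than JavaScript; the same split-on-delimiter pattern should carry over to a JavaScript regex. The function names are invented, and the "Mc" rule is a heuristic that will get some real names wrong, so this is not a comprehensive solution.

```python
import re


def fix_part(part):
    """Capitalize one piece of a name, with a heuristic for the 'Mc' prefix."""
    lower = part.lower()
    if lower.startswith("mc") and len(lower) > 2:
        return "Mc" + lower[2:].capitalize()
    return lower.capitalize()


def proper_case(name):
    """Proper-case a full name, treating -, ' and . as piece boundaries."""
    words = []
    for word in name.split():
        # The capture group keeps the delimiters in the result list.
        pieces = re.split(r"([-'.])", word)
        words.append("".join(fix_part(piece) for piece in pieces))
    return " ".join(words)


print(proper_case("John SMITH"))          # John Smith
print(proper_case("Mary O'SMITH"))        # Mary O'Smith
print(proper_case("E.t MCHYPHEN-SMITH"))  # E.T McHyphen-Smith
```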
Consider the following file
var1 var2 variable3
1 2 3
11 22 33
I would like to load the numbers into a matrix, and the column titles into a variable that would be equivalent to:
variable_names = char('var1', 'var2', 'variable3');
I don't mind splitting the names and the numbers into two files, however preparing MATLAB code...
Hi all,
I've got a text file that contains several 'records' inside of it. Each record contains a name and a collection of numbers as data.
I'm trying to build a class that will read through the file, present only the names of all the records, and then allow the user to select which record data he/she wants.
The first time I go thro...
Let's say you have a text file like this one:
http://www.gutenberg.org/files/17921/17921-8.txt
Does anyone have a good algorithm, or open-source code, to extract words from a text file?
How to get all the words, while avoiding special characters, and keeping things like "it's", etc...
I'm working in Java.
Thanks
...
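A single regular expression can cover the basic case: match runs of letters, and allow an apostrophe only when it sits between letters, so "it's" stays whole while surrounding quotes and punctuation are dropped. The sketch below is in Python rather than Java and uses an invented sample sentence; in Java's regex syntax the letter class could be written as `\p{L}`. Hyphenated words and digits are deliberately left out, which is an assumption about what counts as a word.

```python
import re

# [^\W\d_] matches any Unicode letter (a word character that is neither a
# digit nor an underscore). An apostrophe is accepted only between letters.
WORD_RE = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def extract_words(text):
    """Return the words of text in order, keeping inner apostrophes."""
    return WORD_RE.findall(text)


sample = "It's a dog's life -- isn't it? 'Quoted' words, 42 times!"
print(extract_words(sample))
# ["It's", 'a', "dog's", 'life', "isn't", 'it', 'Quoted', 'words', 'times']
```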
Sometimes you have strings that must fit within a certain pixel width. This function attempts to do so efficiently. Please post your suggestions or refactorings below :)
function fitStringToSize(str,len) {
var shortStr = str;
var f = document.createElement("span");
f.style.visibility = 'hidden'; // 'hidden' is not a valid display value; visibility keeps the span measurable
f.style.padding = '0px';
...
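As one possible refactoring idea, the search for the cut-off point can be done by bisection instead of trimming one character at a time. The sketch below shows only that logic, in Python, and is not the function posted above: `measure` is a hypothetical callback standing in for whatever returns the rendered pixel width (in the browser, the hidden span's width), and the code assumes the width never shrinks as the string gets longer.

```python
def fit_string_to_width(text, max_width, measure, ellipsis="..."):
    """Return text, or its longest prefix plus an ellipsis that fits max_width.

    measure is a caller-supplied function mapping a string to its width.
    """
    if measure(text) <= max_width:
        return text
    lo, hi = 0, len(text)  # lo is always a prefix length known to be usable
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if measure(text[:mid] + ellipsis) <= max_width:
            lo = mid       # this prefix fits, try a longer one
        else:
            hi = mid - 1   # too wide, try a shorter one
    # If not even the ellipsis fits, the ellipsis alone is returned.
    return text[:lo] + ellipsis


def fake_measure(s):
    """Stand-in for real text measurement: pretend every glyph is 7 px wide."""
    return 7 * len(s)


print(fit_string_to_width("Hello, world", 70, fake_measure))  # Hello, ...
```

With bisection the number of measurements grows with the logarithm of the string length, which matters when each measurement forces a layout pass.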
I'm looking for an alternative, since I find Emacs difficult to use. I'd rather use an editor that supports all the usual shortcuts I'm used to, such as arrow keys to move the cursor around, CTRL+SHIFT+RightArrow to select the next word, etc.
Basically, I don't want to have to relearn all my familiar shortcuts just so I can use Emacs.
...
I have this code:
myVariable
which I want to change into
trace("myVariable: " + myVariable);
using a direct hotkey like "alt-f12" to do it, i.e. not using "ctrl-space" and arrow buttons.
Is it possible in Eclipse?
...
Hello,
I have a little C# console program that outputs some text using Console.WriteLine. I then pipe this output into a text file like:
c:myprogram > textfile.txt
However, the file is always an ANSI text file, even when I start cmd with the /u switch.
cmd /? says about the /u switch:
/U Causes the output of internal
commands...
I am writing a silly little app in C++ to test one of my libraries. I would like the app to display a list of commands to the user, allow the user to type a command, and then execute the action associated with that command. Sounds simple enough.
In C# I would end up writing a list/map of commands like so:
class MenuItem
{
...
Hi,
How do I accomplish text wrapping of table fields in an SSRS report, and proper landscape orientation when rendering the report to PDF format?
Thanks in advance
Anna
...
I have HTML that includes symbols such as the Trademark "TM" as superscript (™). In normal HTML, I would use "&trade;" or "&#8482;" to display the Trademark TM. However, I can find no way to import HTML like this into Flex and have it displayed correctly. I am having similar issues with the <li> tag.
My HTML:
<p>This information is intelle...
I have some HTML I am trying to parse. There are cases where the html attributes alone are not going to help me identify the row type (header versus data). Fortunately, if my row is a data row then it should have some values that can be converted to integers. I have figured out how to convert the unicode to an integer for those cases ...
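The row test described here, "a data row has at least one cell that converts to an integer", can be sketched as follows. It assumes the cell texts have already been pulled out of the HTML as strings; the function names are made up, and stripping thousands separators before converting is an extra assumption that may not match the real data.

```python
def to_int(value):
    """Return value as an int, or None if the text is not an integer."""
    try:
        # Assumption: commas are thousands separators and can be dropped.
        return int(value.replace(",", ""))
    except ValueError:
        return None


def is_data_row(cells):
    """Treat a row as data when at least one cell holds an integer."""
    return any(to_int(cell) is not None for cell in cells)


# Hypothetical rows, already reduced to lists of cell strings.
header = ["Name", "Total", "Count"]
data = ["Widgets", "1,204", "37"]

print(is_data_row(header))  # False
print(is_data_row(data))    # True
```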