Hey guys,
I am working on a project where I need to read some generic text. I am looking for an API with which I can read generic text and convert it to a .csv file.
Can anyone please help?
I am using Java on Windows.
--------------------------MORE Detail------------------------------------------------------------------------------...
Is there any OSS that can condense a text into a synopsis?
My goal is to build an editor for SciFi novels which can either automatically create a synopsis for each chapter or at least suggest one.
...
I want to edit the following text so that every line begins with Dealer:. This means no wrapping/line breaks. For lines starting with System, wrapping is fine.
What would a solution in Ruby look like? Thanks.
The text is located in a .txt file:
Dealer: 5 seconds left to act
Dealer: hitman2714 wins the pot (9)
Dealer: Hand #1684326626D
Deale...
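One possible approach, sketched here in Python rather than Ruby. The function name `unwrap_dealer_lines` is hypothetical, and the sketch assumes that any line starting with neither `Dealer:` nor `System` is a wrapped continuation of the line before it.

```python
def unwrap_dealer_lines(lines):
    """Join wrapped continuation lines onto the preceding 'Dealer:' line.

    Assumption: a line that starts with neither 'Dealer:' nor 'System'
    is a wrapped continuation of the line before it.
    """
    result = []
    in_dealer = False  # True while the last emitted line is a Dealer line
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("Dealer:"):
            result.append(line)
            in_dealer = True
        elif line.startswith("System"):
            result.append(line)
            in_dealer = False
        elif in_dealer and result:
            # Continuation of a Dealer line: merge it into the previous line.
            result[-1] = result[-1].rstrip() + " " + line.strip()
        else:
            # Wrapped System text (or leading text) is left as it is.
            result.append(line)
    return result
```

Passing the lines of the .txt file through this function and writing the result back, one entry per line, should leave every Dealer message on a single line while System messages keep their wrapping.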
Hi,
Do you know of an effective method for extracting key sentences from a text, together with parameters such as their frequency, that can also do "stemming" (i.e. also find similar sentences)?
I also wonder whether there is a software implementation.
Thanks a lot
...
We keep track of user agent strings on our website. I want to do some statistics on them to see how many IE6 users we have (so we know what we have to develop against), and also how many mobile users we have.
So we have log entries like this:
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FunWebProducts)
Mozilla/4.0 (compatible; ...
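A rough sketch of how such counts could be gathered, in Python. The function name `summarize_user_agents` and the `MOBILE_HINTS` substrings are assumptions made for illustration; mobile detection by substring is only a heuristic and the list is not exhaustive.

```python
import re
from collections import Counter

# Captures the major version from tokens such as "MSIE 7.0".
MSIE_VERSION = re.compile(r"MSIE (\d+)")

# Heuristic only: these substrings are an assumption, not an exhaustive list.
MOBILE_HINTS = ("Mobile", "iPhone", "Android", "BlackBerry", "Opera Mini")


def summarize_user_agents(user_agents):
    """Count IE major versions and (heuristically) mobile user agents."""
    counts = Counter()
    for ua in user_agents:
        match = MSIE_VERSION.search(ua)
        if match:
            counts["IE" + match.group(1)] += 1
        if any(hint in ua for hint in MOBILE_HINTS):
            counts["mobile"] += 1
    return counts
```

The returned `Counter` can then be read as `counts["IE6"]` and `counts["mobile"]`; keys that never occurred simply report 0.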
I've written a Ruby script that reads a file (File.read()) containing Unicode characters, and it works fine from the command line.
However, when I try to put it into an Automator Workflow (Mac OS X), I get this error:
2009-12-23 17:55:15 -0500: /Users/jeffreyaylesworth/bin/symbols:19:in `split': invalid byte sequence in US-ASCI...
Hi,
Does anybody know of an open-source/free library that does term clustering?
Thanks,
yaniv
...
Hi everyone,
I have a file containing multiple records with data like this:
F00DY4302B8JRQ rank=0000030 x=800.0 y=1412.0 length=89
Now I want to find each line where length<=50, delete that line and the next line in the file, and write to another file.
Thanks everyone
...
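A minimal sketch in Python, under one reading of the question: the lines that remain after the deletions are what gets written to the other file. The function name `drop_short_records` is hypothetical. A line that is skipped as "the next line" is not itself inspected for a length field.

```python
import re

# Matches the "length=<number>" field of a record line.
LENGTH_FIELD = re.compile(r"\blength=(\d+)\b")


def drop_short_records(lines, max_length=50):
    """Drop every line whose length field is <= max_length, plus the line after it."""
    kept = []
    skip_next = False
    for line in lines:
        if skip_next:
            # This is the line following a dropped record line.
            skip_next = False
            continue
        match = LENGTH_FIELD.search(line)
        if match and int(match.group(1)) <= max_length:
            skip_next = True
            continue
        kept.append(line)
    return kept
```

To use it on files, the input can be read with `readlines()` and the returned list written to the second file with `writelines()`.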
How can I split a large text file into separate files by character count using PHP? So a 10,000 character file split every 1000 characters would be split into 10 files. Further, can I split only after a full stop is found?
Thanks.
UPDATE 1: I like zombat's code. I removed some errors and came up with the following, but does anyo...
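The splitting logic could look roughly like the following. This is a sketch in Python rather than PHP, it is not zombat's code, and the function name `split_at_full_stops` is made up for illustration. It cuts at the last full stop inside each window and falls back to a hard cut when the window contains none.

```python
def split_at_full_stops(text, max_chars=1000):
    """Split text into chunks of at most max_chars, ending on a full stop when possible."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Look for the last full stop inside the current window.
            stop = text.rfind(".", start, end)
            if stop != -1:
                end = stop + 1  # keep the full stop with its chunk
        chunks.append(text[start:end])
        start = end
    return chunks
```

Each chunk can then be written to its own numbered file. Joining the chunks back together reproduces the original text exactly.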
I'm trying to open a text file and output its contents with the code below. The text file includes line breaks, but when I echo the file it's unformatted. How do I fix this?
Thanks.
<html>
<head>
</head>
<body>
<?php
$fh = fopen("filename.txt", 'r');
$pageText = fread($fh, 25000);
echo $pageText;
?>
</body>
</html...
Hello,
I've got a list containing positions of text additions and deletions, like this:
    Type  Position  Text/Length
1.  +     2         ab           // 'ab' was added at position 2
2.  +     1         cde          // 'cde' was added at position 1
3.  -     4         1            // a character was deleted at position 4
T...
I have input data with three tab-separated columns, like this:
a mrna_185598_SGL 463
b mrna_9210_DLT 463
c mrna_9210_IND 463
d mrna_9210_INS 463
e mrna_9210_SGL 463
How can I use sed/awk to turn it into
four-column data that looks like this:
a mrna_185598 SGL 463
b mrna_9210 DLT 463
c mrna_9210 ...
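The question asks for sed/awk; purely as an illustration of the transformation, here is the same idea sketched in Python. The function name `split_second_column` is hypothetical, and the sketch assumes every line has exactly three tab-separated fields and that the split happens at the last underscore of the second field.

```python
def split_second_column(line):
    """Turn 'a<TAB>mrna_185598_SGL<TAB>463' into 'a<TAB>mrna_185598<TAB>SGL<TAB>463'."""
    first, second, third = line.rstrip("\n").split("\t")
    # rpartition splits at the LAST underscore, so 'mrna_9210_DLT'
    # becomes ('mrna_9210', '_', 'DLT').
    name, _, suffix = second.rpartition("_")
    return "\t".join([first, name, suffix, third])
```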
Hi all,
I am currently trying to adjust the values stored in a table in MySQL. The values stored contain a series of Unicode characters. I need to truncate to 40 bytes worth of storage, but when I try:
UPDATE `MYTABLE` SET `MYCOLUMN` = LEFT(`MYCOLUMN`, 40);
MySQL is overly helpful and retains 40 characters, rather than 40 bytes. Is t...
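To illustrate the difference between 40 characters and 40 bytes, here is a small Python sketch of byte-based truncation that never leaves half a character behind. It is not a MySQL solution; the function name `truncate_utf8` is hypothetical and the sketch assumes the data is UTF-8.

```python
def truncate_utf8(text, max_bytes=40):
    """Truncate text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")[:max_bytes]
    # The slice may end in the middle of a multi-byte character;
    # errors="ignore" drops that incomplete trailing sequence.
    return encoded.decode("utf-8", errors="ignore")
```

With multi-byte characters the result can be shorter than 40 bytes, because a character that does not fit completely is dropped rather than cut in half.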
I have a file containing a large number of occurrences of the string Guid="GUID HERE" (where GUID HERE is a unique GUID at each occurrence) and I want to replace every existing GUID with a new unique GUID.
This is on a Windows development machine, so I can generate unique GUIDs with uuidgen.exe (which produces a GUID on stdout every tim...
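As a sketch of the replacement itself, the following Python uses the standard `uuid` module in place of uuidgen.exe, which is a substitution on my part. The function name `replace_guids` is hypothetical, and the pattern assumes the value between the quotes never contains a double quote.

```python
import re
import uuid

# Matches Guid="..." with any value between the quotes.
GUID_ATTR = re.compile(r'Guid="[^"]*"')


def replace_guids(text):
    """Replace every Guid="..." value with a freshly generated random GUID."""
    # The replacement function is called once per match,
    # so each occurrence receives its own new GUID.
    return GUID_ATTR.sub(lambda _match: 'Guid="{}"'.format(uuid.uuid4()), text)
```

Note that `uuid.uuid4()` renders in lowercase; if the file uses uppercase GUIDs, `str(uuid.uuid4()).upper()` would be needed instead.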
I want to learn sed. Can you please point me to good references so that I can make full use of it?
I want to learn it for do-once-then-forget administrative or dev-tool tasks, so I don't really care about performance, modularity, object orientation, etc. when writing this type of code. Do you think it wou...
When you search for something on Stack Overflow, it shows the portion of the question description that best matches your criteria and highlights the criteria words.
I wonder what the best way is to do this manually in C#, that is, without the help of a full-text search engine.
The main problem is how to select the best text portion in a fast ...
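One simple way to pick the portion is a sliding window over the words, keeping the window that contains the most query terms. The sketch below is in Python rather than C# and is only one possible heuristic, not how Stack Overflow does it. The function name `best_snippet`, the window size, and the `**` markers are all assumptions for illustration.

```python
import re


def best_snippet(text, query, window=30):
    """Return the window-sized word span with the most query-term hits, hits marked."""
    words = text.split()
    terms = {term.lower() for term in query.split()}

    def is_hit(word):
        # Compare case-insensitively, ignoring punctuation attached to the word.
        return re.sub(r"\W+", "", word).lower() in terms

    hits = [1 if is_hit(word) else 0 for word in words]

    # Slide a fixed-size window over the words, updating the hit count in O(1).
    score = sum(hits[:window])
    best_start, best_score = 0, score
    for start in range(1, len(words) - window + 1):
        score += hits[start + window - 1] - hits[start - 1]
        if score > best_score:
            best_start, best_score = start, score

    snippet = words[best_start:best_start + window]
    return " ".join("**" + w + "**" if is_hit(w) else w for w in snippet)
```

The scan is linear in the number of words, which addresses the speed concern for moderately sized texts. Ties go to the earliest window.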
Hi, I have a longitudinal data set generated by a computer simulation that can be represented by the following tables ('var' are variables):
time subject var1 var2 var3
t1 subjectA ...
t2 subjectB ...
and
subject name
subjectA nameA
subjectB nameB
However, the file generated writes a data file in a format similar to the f...
I have a text file which is:
ABC 50
DEF 70
XYZ 20
DEF 100
MNP 60
ABC 30
I want output that sums the values for each name. For example, the total of all ABC values in the file is 80 (50 + 30), and the total for DEF is 170 (100 + 70). So the output should list each unique first-column name with its sum:
ABC 80
DEF 170
XYZ 20
MNP 60...
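A minimal Python sketch of the summation. The function name `sum_by_name` is hypothetical, and the sketch assumes each non-empty line holds exactly one name and one integer separated by whitespace.

```python
def sum_by_name(lines):
    """Sum the second column per first-column name, keeping first-seen order."""
    totals = {}  # dicts preserve insertion order (Python 3.7+)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, value = line.split()
        totals[name] = totals.get(name, 0) + int(value)
    return totals
```

Printing `name, total` for each item of the returned dict gives the names in the order they first appear in the file.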
In our project we need to import CSV files into Postgres.
There are multiple types of files: the number of columns varies, since some files have fewer columns and some have all of them.
We need a fast way to import these files into Postgres. I want to use Postgres's COPY FROM since the speed requirements of the processing are ...
This
sed "s/public \(.*\) get\(.*\)()/\1 \2/g"
will transform this
public class ChallengeTO extends AbstractTransferObject {
public AuthAlgorithm getAlgorithm();
public long getCode();
public int getIteration();
public String getPublicKey();
public String getSelt();
};
into this
public class ...
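For readers less familiar with sed's basic regular expressions, the substitution above can be mirrored in Python as a rough equivalent. The function name `strip_getter` is hypothetical. In sed's basic syntax `\(` and `\)` delimit groups while a bare `()` is literal, so the Python pattern flips the escaping.

```python
import re

# Python equivalent of: s/public \(.*\) get\(.*\)()/\1 \2/g
GETTER = re.compile(r"public (.*) get(.*)\(\)")


def strip_getter(line):
    """Rewrite 'public <Type> get<Name>()' as '<Type> <Name>'; other lines pass through."""
    return GETTER.sub(r"\1 \2", line)
```

Lines without a `get...()` call, such as the class declaration, are returned unchanged.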