I'm looking for help with the following:
I have a directory full of text files that are named with a numerical ID. Each text file contains the body of a news article. Some news articles are split into several parts, so they are spread across several text files.
The files are named like this:
1001_1.txt, 1001_2.txt (two parts of the same article)
1002_1.txt
1003_1.txt
1004_1.txt, 1004_2.txt, 1004_3.txt, 1004_4.txt (four parts of the same article; there are never more than four parts)
and so on.
Basically, I need a script (PHP, Perl, Ruby or otherwise) that puts the part of the file name before the underscore in one column, the number after the underscore (if any) in a second column, and the content of the text file in a third column.
So you would have a table structure looking like this:
1001 | 1 | content of the text file
1001 | 2 | content of the text file
1002 | 1 | content of the text file
1003 | 1 | content of the text file
Any help on how I can accomplish this would be appreciated.
There are about 7000 text files that need to be read and imported into a table for later use in a database.
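To make the first layout concrete, here is a rough sketch in Python (the language choice is arbitrary; any of the ones mentioned above would do). It assumes every file name follows the ID_part.txt pattern, that the files are UTF-8 encoded, and that a CSV file is an acceptable intermediate step before the database import. The names `collect_rows` and `write_long_csv` are only illustrative.

```python
import csv
import re
from pathlib import Path

# Matches names such as "1001_2.txt": article ID, underscore, part number.
NAME_RE = re.compile(r"^(\d+)_(\d+)\.txt$")


def collect_rows(directory):
    """Return (article_id, part, content) tuples sorted by ID, then part."""
    rows = []
    for path in Path(directory).iterdir():
        match = NAME_RE.match(path.name)
        if match is None or not path.is_file():
            continue  # skip anything that does not follow the naming scheme
        article_id = int(match.group(1))
        part = int(match.group(2))
        # Assumption: files are UTF-8; undecodable bytes are replaced.
        content = path.read_text(encoding="utf-8", errors="replace")
        rows.append((article_id, part, content))
    rows.sort(key=lambda row: (row[0], row[1]))
    return rows


def write_long_csv(directory, out_path):
    """Write one CSV row per file: article_id, part, content."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["article_id", "part", "content"])
        writer.writerows(collect_rows(directory))
```

Calling `write_long_csv("articles", "articles_long.csv")` would then produce one row per text file, where "articles" is a placeholder for the real directory name. The `csv` module quotes fields that contain line breaks or commas, so multi-paragraph article bodies should survive the round trip.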
It would be even better if the content of the _1, _2, etc. files could be put in separate columns, e.g.:
1001 | 1 | content | 2 | content | 3 | content | 4 | content
1002 | 1 | content
1003 | 1 | content
(As I said, the file names go up to _4 at most, so you could have 1001_1.txt, 1001_2.txt, 1001_3.txt and 1001_4.txt, or only 1002_1.txt and 1003_1.txt.)
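The one-row-per-article layout could be sketched along these lines, again in Python and again only as an illustration. It assumes UTF-8 files and the ID_part.txt naming pattern, and it simplifies the layout slightly: instead of repeating the part number in its own column, it uses four fixed content columns and leaves the missing ones empty. `collect_wide_rows`, `write_wide_csv` and `MAX_PARTS` are made-up names.

```python
import csv
import re
from collections import defaultdict
from pathlib import Path

# Matches names such as "1004_3.txt": article ID, underscore, part number.
NAME_RE = re.compile(r"^(\d+)_(\d+)\.txt$")
MAX_PARTS = 4  # the file names go up to _4 at most


def collect_wide_rows(directory):
    """Return one list per article: [article_id, part 1, ..., part 4]."""
    parts_by_id = defaultdict(dict)
    for path in Path(directory).iterdir():
        match = NAME_RE.match(path.name)
        if match is None or not path.is_file():
            continue  # skip anything that does not follow the naming scheme
        article_id = int(match.group(1))
        part = int(match.group(2))
        # Assumption: files are UTF-8; undecodable bytes are replaced.
        parts_by_id[article_id][part] = path.read_text(
            encoding="utf-8", errors="replace"
        )
    rows = []
    for article_id in sorted(parts_by_id):
        parts = parts_by_id[article_id]
        # Missing parts become empty strings; parts above MAX_PARTS are ignored.
        rows.append(
            [article_id] + [parts.get(n, "") for n in range(1, MAX_PARTS + 1)]
        )
    return rows


def write_wide_csv(directory, out_path):
    """Write one CSV row per article with four content columns."""
    header = ["article_id"] + [f"content_{n}" for n in range(1, MAX_PARTS + 1)]
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(collect_wide_rows(directory))
```

With fixed columns the part number is implied by the column position, which tends to be easier to map onto a database table than a variable number of number/content pairs.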