Is it possible to split a file? For example, I have a huge wordlist and want to split it into more than one file. How can I do this?
A:
Sure, just read in the file and write out some of the words to each different output file. It's possible to do this in any programming language.
David Zaslavsky
2009-02-13 16:08:19
A:
Easily. I'd suggest iterating over the file and writing to a new file as necessary, then deleting the original. This answer is fairly intuitive to me, though, so I'm not sure if it's insufficient, or if perhaps it needs more clarification.
Devin Jeanpierre
2009-02-13 16:08:38
+2
A:
Sure it's possible:
open input file
open output file 1
count = 0
for each line in file:
    write to output file
    count = count + 1
    if count >= maxlines:
        close output file
        open next output file
        count = 0
Charlie Martin
2009-02-13 16:10:36
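For anyone who wants something runnable, here is one possible Python rendering of the pseudocode above. It is a sketch, not the answerer's own code: the function name split_by_lines, its parameters, and the "<base><n>.txt" naming scheme are assumptions made for illustration. It opens each output file only when there is a line to put in it, so no empty trailing file is left behind.

```python
def split_by_lines(input_path, output_base, max_lines):
    """Split input_path into files holding at most max_lines lines each.

    Hypothetical helper: output files are named <output_base>1.txt,
    <output_base>2.txt, and so on. Returns the list of paths written.
    """
    paths = []
    out = None
    count = 0
    try:
        with open(input_path, 'r') as src:
            for line in src:
                # Open the next output file only when a line is waiting for it
                if out is None:
                    path = '%s%d.txt' % (output_base, len(paths) + 1)
                    out = open(path, 'w')
                    paths.append(path)
                out.write(line)
                count += 1
                # This file is full: close it and reset the counter
                if count >= max_lines:
                    out.close()
                    out = None
                    count = 0
    finally:
        # Close a partially filled last file, also if an error occurred
        if out is not None:
            out.close()
    return paths
```

Calling split_by_lines('wordlist.txt', 'part', 1000) would write part1.txt, part2.txt, and so on, reading only one line into memory at a time.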
Don't forget to reset your count after opening the new file...
Sean Cavanagh
2009-02-13 16:20:29
right, or test count mod maxlines.
Charlie Martin
2009-02-13 19:05:58
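The "count mod maxlines" idea from the comment above could look roughly like this in Python. Again this is only a sketch under assumptions: split_with_mod and its parameters are made-up names, and the output naming scheme is arbitrary. Instead of resetting a counter, it uses the running line index and starts a new file whenever the index is a multiple of max_lines.

```python
def split_with_mod(input_path, output_base, max_lines):
    """Split a file by testing the line index modulo max_lines.

    Hypothetical helper: returns the number of output files written,
    named <output_base>1.txt, <output_base>2.txt, and so on.
    """
    out = None
    file_count = 0
    try:
        with open(input_path, 'r') as src:
            for index, line in enumerate(src):
                # Index 0, max_lines, 2*max_lines, ... each start a new file
                if index % max_lines == 0:
                    if out is not None:
                        out.close()
                    file_count += 1
                    out = open('%s%d.txt' % (output_base, file_count), 'w')
                out.write(line)
    finally:
        # Close whichever output file is still open
        if out is not None:
            out.close()
    return file_count
```

There is no counter to forget to reset here, at the cost of a modulo test on every line.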
+2
A:
This one splits a file up by newlines and writes it back out. You can change the delimiter easily. It also handles uneven amounts, if your input file doesn't have a multiple of splitLen lines (20 in this example).
splitLen = 20          # 20 lines per file
outputBase = 'output'  # output1.txt, output2.txt, etc.

# This is shorthand and not friendly with memory
# on very large files (Sean Cavanagh), but it works.
with open('input.txt', 'r') as f:
    lines = f.read().split('\n')

at = 1
for start in range(0, len(lines), splitLen):
    # First, get the list slice
    outputData = lines[start:start + splitLen]

    # Now open the output file, join the new slice with newlines
    # and write it out. The with block closes the file.
    with open(outputBase + str(at) + '.txt', 'w') as output:
        output.write('\n'.join(outputData))

    # Increment the counter
    at += 1
sli
2009-02-13 16:17:41
Might mention that for REALLY BIG FILES, open().read() chews a lot of memory and time. But mostly it's okay.
Sean Cavanagh
2009-02-13 16:21:32
Oh, I know. I just wanted to throw together a working script quickly, and I normally work with small files. I end up with shorthand like that.
sli
2009-02-15 21:06:34
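On the memory point raised in the comments above: a variant that never holds more than one chunk in memory could be sketched with itertools.islice, which pulls at most split_len lines from the open file on each pass. This is not sli's script, just an illustration; the function name split_in_chunks and its parameters are assumptions, while the output naming follows the script above.

```python
from itertools import islice


def split_in_chunks(input_path, output_base, split_len=20):
    """Write successive groups of split_len lines to numbered files.

    Hypothetical helper: only one chunk of lines is in memory at a time.
    Returns the list of output paths written.
    """
    written = []
    with open(input_path, 'r') as src:
        while True:
            # islice consumes up to split_len lines from the file iterator
            chunk = list(islice(src, split_len))
            if not chunk:
                break  # input exhausted
            path = '%s%d.txt' % (output_base, len(written) + 1)
            with open(path, 'w') as out:
                # Lines keep their own newline characters
                out.writelines(chunk)
            written.append(path)
    return written
```

Because the lines are written back unchanged, concatenating the output files in order should reproduce the input exactly.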
+1
A:
Is this a duplicate? See: http://stackoverflow.com/questions/291740/how-do-i-split-a-huge-text-file-in-python
quamrana
2009-02-13 17:00:02