I have a large file called fulldataset. I would like to write lines from fulldataset to a new file called newdataset, but only the lines that contain one of the id numbers present in listfile. All the id numbers start with XY, and they occur in the middle of each line.
Here is an example line from list file:
Robert, Brown, "XY-12344343", 1929232, 324934923,
Here is the program I have so far. It runs fine, but doesn't write anything into the new file.
datafile = open('C:\\listfile.txt', 'r')
completedataset = open('C:\\fulldataset.txt', 'r')
smallerdataset = open('C:\\newdataset.txt', 'w')
matchedLines = []
for line in datafile:
    if line.find("XY"):
        matchedLines.append( line )
counter = 1
for line in completedataset:
    print counter
    counter +=1
    for t in matchedLines:
        if t in line:
            fulldataset.write(line)
            del line
            break
datafile.close()
completedataset.close()
fulldataset.close()
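A small check that may be relevant to the `if line.find("XY")` test above: `str.find` returns the index of the first match, or -1 when the substring is absent, so the result is truthy for a missing substring and falsy only when the match sits at position 0. The sample strings below are made up for illustration (the first is the example line shortened), and the `in` operator is shown next to `find` for comparison.

```python
# str.find returns an index (or -1 when absent), not a boolean.
samples = [
    'Robert, Brown, "XY-12344343", 1929232',  # id in the middle of the line
    'XY-12344343',                            # id at the very start
    'no id here',                             # no id at all
]

find_results = [s.find("XY") for s in samples]        # raw indexes
truthy_results = [bool(r) for r in find_results]      # what `if s.find(...)` sees
contains_results = ["XY" in s for s in samples]       # what `if "XY" in s` sees

for s, idx, truthy, contains in zip(samples, find_results,
                                    truthy_results, contains_results):
    print("%r: find=%d, bool(find)=%s, 'XY' in s=%s" % (s, idx, truthy, contains))
```

If this is right, `"XY" in line` (or `line.find("XY") != -1`) looks like the safer test for "this line has an id".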
EDIT:
Ok here is the new program:
datafile = open('C:\\tryexcel33.txt', 'r')
completedataset = open('C:\\fulldataset.txt', 'r')
smallerdataset = open('C:\\newdataset.txt', 'w')
counter = 1
for line in completedataset:
    print counter
    counter +=1
    if any( id in line for id in datafile ):
        smallerdataset.write( line )
        break
datafile.close()
completedataset.close()
fulldataset.close()
I still don't have anything being written to the new file. I think one problem might be that in the full file each id number has a " character in front of it, but that quote doesn't appear in the listfile. Any thoughts?
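One direction that might sidestep the quote issue is to compare the ids themselves rather than whole lines. Two things could matter here: every line read from the list file ends with a newline, so the whole line is unlikely to be a substring of a dataset line, and a file object can only be iterated once, so `datafile` is exhausted after the first pass of `any(...)`. The following is only a minimal sketch. It assumes ids look like `XY-` followed by digits (based on the single example line), and the names `ID_PATTERN`, `load_ids` and `filter_lines` are hypothetical, as is the sample data.

```python
import re

# Assumption: an id is "XY-" followed by digits, as in the example line.
ID_PATTERN = re.compile(r'XY-\d+')

def load_ids(list_lines):
    """Collect every id found in the list file lines into a set."""
    ids = set()
    for line in list_lines:
        ids.update(ID_PATTERN.findall(line))
    return ids

def filter_lines(full_lines, wanted_ids):
    """Yield each dataset line that mentions at least one wanted id."""
    for line in full_lines:
        if any(found in wanted_ids for found in ID_PATTERN.findall(line)):
            yield line

# Made-up sample data; open file objects could be passed in the same way.
list_lines = ['XY-12344343\n', 'XY-555\n']
full_lines = [
    'Robert, Brown, "XY-12344343", 1929232, 324934923,\n',
    'Jane, Doe, "XY-999", 1, 2,\n',
]

wanted = load_ids(list_lines)
matched = list(filter_lines(full_lines, wanted))
print(matched)
```

With the real files, the same functions could be given the opened listfile and fulldataset, and each yielded line written to newdataset. Pulling the id out with the pattern means surrounding quotes and trailing newlines no longer affect the comparison, and the set lookup avoids rescanning the whole list for every dataset line.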