I have a number of large (~100 MB) files which I process regularly. While I try to delete unneeded data structures during processing, memory consumption is a bit too high. So I was wondering: is there a way to 'efficiently' manipulate large data? E.g.:
    def read(self, filename):
        fc = read_100_mb_file(filename)
        self.process(fc)

    def process(self, content):
        # do some processing of file content
        ...
Is there a duplication of data structures here? Isn't it more memory-efficient to use an instance attribute like self.fc?
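That is, is the variant below any better memory-wise (just a sketch of what I mean, with read_100_mb_file as a placeholder):

    def read(self, filename):
        # store the content on the instance instead of passing it around
        self.fc = read_100_mb_file(filename)
        self.process()

    def process(self):
        # work on self.fc directly -- does this avoid a copy,
        # or does passing fc as an argument not copy it anyway?
        ...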
How do I garbage-collect? I know about the gc module, but do I call it after I del fc, for example? Is the garbage collector invoked after a del statement at all? When should I use explicit garbage collection?
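In other words, is this the pattern I should be using (again, just a sketch of what I have in mind):

    import gc

    def read(self, filename):
        fc = read_100_mb_file(filename)
        self.process(fc)
        del fc        # drop the last reference to the file content
        gc.collect()  # is an explicit collection needed here, or is del enough?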
Update:
P.S. 100 MB is not a problem in itself, but the float conversion and further processing add significantly more to both the working set and the virtual size (I'm on Windows).
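To give an idea of the kind of processing involved (illustrative only, the real code is more involved than this):

    def process(self, content):
        # parse every line into floats -- each value becomes a full Python
        # float object inside nested lists, which adds significant overhead
        # on top of the raw text
        rows = [[float(x) for x in line.split()] for line in content.splitlines()]
        # ... further processing of rows ...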