Hi,
I'm trying to parse large XML files (>3GB) like this:
import lxml.etree

context = lxml.etree.iterparse(path)
for action, el in context:
    pass  # do sth. with el
I assumed that with iterparse the data is not loaded into RAM all at once, but according to this article I was wrong:
http://www.ibm.com/developerworks/xml/library/x-hiperfparse/ (see Listing 4)
However, when I apply that solution to my code, some elements are clearly being cleared before they have been parsed, especially child elements of el.
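To make the problem concrete, this is roughly how I applied the clearing from Listing 4 to my loop (the path is just a stand-in for my real files):

import lxml.etree

path = "huge.xml"  # stand-in for my real >3 GB files

context = lxml.etree.iterparse(path, events=('end',))
for action, el in context:
    # ... do sth. with el ...
    el.clear()                           # free the element's text and children
    while el.getprevious() is not None:  # drop already-processed siblings
        del el.getparent()[0]
del context

Since every 'end' event clears its element, the children of an outer element are already gone by the time the loop reaches that element.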
Is there any other solution to this memory problem?
Thanks in advance!