Why does this code eat up memory? When I run it, it slowly consumes more memory with every iteration of the loop, and I have something like 300000 iterations. I'm using Windows and Python 2.6.

def LoadVotes(self):
    old_votes=Votes.objects.all()
    amount=old_votes.count()
    print 'Amount of votes is: ' + str(amount)
    c=0
    for row in old_votes:
        try:
            new_id_user=LegacyUserId.objects.get(legacy_id=row._login)
        except LegacyUserId.DoesNotExist:
            string=" user with old id "+str(row._login)+" does not match new user id \n"
            log=open('log_add_old_votes.txt','a')
            log.write(string)
            continue
        try:
            new_id_media=LegacyMedia.objects.get(legacy_id=row.media_file_id)
        except LegacyMedia.DoesNotExist:
            log_text='old media with ID:'+str(row.media_file_id)+' is not found in relation with new media \n'
            log=open('log_add_old_votes.txt','a')
            log.write(log_text)
            continue
        mo=MediaObject.objects.get(pk=new_id_media.object_id)
        new_votes_item, created=Mark.objects.get_or_create(user=new_id_user.user, media_object=mo, defaults={'mark':int(row.rate)*2}) 
        c=c+1
        i=amount-c
        print '\rRemain:',
        stdout.write("%d" % i)
        stdout.flush()
A: 

Presumably because it's loading objects for every Vote in your database, and then iterating through those votes and loading LegacyUserIds for each one, and LegacyMedia objects for each one.

If the amount of data you have is large, or if these objects are large, this will take a lot of memory.

I wouldn't be surprised if LegacyMedia was a pretty big object itself.

John Weldon
Yes, but on every iteration it should clear the variables. Instead, it keeps eating my memory. Maybe it is because I'm not closing the log file, so every iteration opens the file again and takes more memory for it? I did not use log.close().
Pol
+1  A: 

You are never closing the files you open. You should be doing file access like this:

with open('log_add_old_votes.txt','a') as log:
    log.write(string)

This will automatically close the file object for you once you are done with it. You are also using the same file for each log message, so you could move the open to before the loop and use the same file object until you finish.
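
A rough sketch of the "open once, before the loop" variant. The helper name `log_unmatched` and the plain list of ids standing in for the vote rows are made up for illustration; only the file name and message text come from the question.

```python
def log_unmatched(legacy_ids, path='log_add_old_votes.txt'):
    """Append one line per unmatched legacy id, opening the file only once."""
    written = 0
    # The file is opened a single time and closed automatically when the
    # with-block exits, even if an exception is raised inside the loop.
    with open(path, 'a') as log:
        for legacy_id in legacy_ids:
            log.write('user with old id %s does not match new user id\n' % legacy_id)
            written += 1
    return written
```

In the original loop the same idea would mean wrapping the whole `for row in old_votes:` loop in one `with open(...) as log:` block.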

unholysampler
+5  A: 

If you run with DEBUG=True, Django stores every query it executes in memory. Try changing to DEBUG=False in your settings.py file.
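
If DEBUG has to stay on for some reason, Django also provides `django.db.reset_queries()`, which empties the stored query list. Below is a minimal sketch of calling it periodically during a long loop. The generator name `with_periodic_reset` and the `every` parameter are made up for illustration, and the reset function is passed in as an argument so the snippet itself does not depend on Django.

```python
def with_periodic_reset(rows, reset, every=1000):
    """Yield each row, calling reset() once after every `every` rows."""
    for count, row in enumerate(rows, 1):
        yield row
        # Runs when the consumer asks for the next row, i.e. after the
        # previous row has been fully processed.
        if count % every == 0:
            reset()
```

Inside LoadVotes this could be used as `for row in with_periodic_reset(old_votes, reset_queries):` after `from django.db import reset_queries`.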

liori
You're right, I had DEBUG=True.
Pol
Thanks! You were right. It was because of DEBUG=True.
Pol
+2  A: 

I'm not sure what the Vote model looks like. But you're only interested in two attributes from Vote (_login and media_file_id). So you might consider using the values or values_list queryset API instead -- this way you only select the fields you need, and you don't create an object for each row.

Also, depending on how many more Votes you have than LegacyUserId or LegacyMedia rows, if you have a foreign key, you might consider selecting those rows directly through a join, rather than iterating through the votes and issuing a new query for each one to check whether the IDs exist.

Finally, this won't affect memory as much, but consider using Python's logging module instead of the current method. (Or at least open the file once at the start of the function instead of every time you need to write.)
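
A small sketch of the logging suggestion, using only the standard library. The function name `build_logger`, the logger name `load_votes`, and the message format are assumptions for illustration, not anything from the question.

```python
import logging

def build_logger(path, name='load_votes'):
    """Return a logger that appends to `path`; the file is opened once."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Attach the file handler only once, so repeated calls with the same
    # name do not open the file again or duplicate every message.
    if not logger.handlers:
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
        logger.addHandler(handler)
    return logger
```

The two `open(...)`/`log.write(...)` pairs in the loop would then become calls such as `logger.warning('user with old id %s does not match new user id', row._login)`.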

ars
I have no relations in my databases (no foreign keys). That's why I'm creating an instance of every model I need. You're right. I changed my code. Now I'm studying how to use logging.
Pol
The problem was with DEBUG=True. But thank you anyway.
Pol