Hi,

I have a filter that I use for language support in my webapp. But when I publish it to GAE, it keeps telling me that the CPU usage is too high.

I think I have traced the problem to the filter I use for language support. I use this in my templates:

<h1>{{ "collection.header"|translate:lang }}</h1>

The filter code looks like this:

import re
from google.appengine.ext import webapp
from util import dictionary

register = webapp.template.create_template_register()

def translate(key, lang):
    d = dictionary.GetDictionaryKey(lang, key)
    if d == False:
        return "no key for " + key
    else: 
        return d.value

register.filter(translate)

I'm too new to Python to see what's wrong with it. Or is this the wrong approach entirely?

..fredrik

A: 

A little more about what I'm trying to do: I'm trying to find a way to handle language support. A user needs to be able to update text elements via an admin page. As of now I have all text elements stored in a db.Model, and I use a filter to get the right key based on language.

After a lot of testing I still can't get it to work well enough. When published, I still get error messages in the logs about too much CPU usage. A typical page has about 30-50 text elements, and according to the logs each page load uses about 1500ms (900ms API). I'm starting to think this might not be the best approach?

I've tried using both memcache and indexes to get around the CPU usage. It helps a little. Should one use memcache and manually added indexes?

This is what my filter looks like:

from google.appengine.ext import webapp
from google.appengine.api import memcache

from util import dictionary

register = webapp.template.create_template_register()

def translate(key, lang):
    data = memcache.get("dictionary" + lang)

    if data is None:
        # Cache miss: load every entry for this language and cache it for 60 seconds.
        data = dictionary.GetDictionaryKey(lang)
        memcache.add("dictionary" + lang, data, 60)

    # GetDictionaryKey returns False when the language has no entries.
    if data and key in data:
        return data[key]
    else:
        return "no key for " + key


register.filter(translate)

And util.dictionary looks like this:

from google.appengine.ext import db

class DictionaryEntries(db.Model):
    lang = db.StringProperty()
    dkey = db.StringProperty()
    value = db.TextProperty()
    params = db.StringProperty()

    @property
    def itemid(self):
        return self.key().id()

def GetDictionaryKey(lang):
    entries = DictionaryEntries.all().filter("lang = ", lang)
    if entries.count() > 0:
        langObj = {}
        for entry in entries:
            langObj[entry.dkey] = entry.value

        return langObj 
    else:
        return False
fredrik
+2  A: 

Have you considered switching to standard gettext methods? Gettext is a widely used approach to internationalization and is well integrated into the Python (and Django) world.

Your template would then look like this:

{% load i18n %}
<h1>{% trans "Header of my Collection" %}</h1>

The files for translations can be generated by manage.py:

manage.py makemessages -l fr

to generate French (fr) messages, for example.

Gettext performs well, so I doubt that you will experience a significant slow-down with this approach compared to storing the translation table in memcache. What's more, it lets you work with "real" messages instead of abstract dictionary keys, which is, at least in my experience, much better if you have to read and understand the code (or if you have to find and change a certain message).
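
To illustrate the "real messages" point, here is a minimal sketch using Python's standard gettext module directly, outside of Django. The domain name "myapp_demo" and the empty locale directory are assumptions made only for this example; with fallback=True and no catalog present, the original message comes back unchanged.

```python
import gettext
import tempfile

# Hypothetical setup: an empty locale directory, so no French catalog exists.
localedir = tempfile.mkdtemp()

# fallback=True returns a NullTranslations object instead of raising an
# error when no .mo file is found for the requested language.
translations = gettext.translation(
    "myapp_demo", localedir=localedir, languages=["fr"], fallback=True
)
_ = translations.gettext

# The message itself is the lookup key, so the code stays readable.
header = _("Header of my Collection")
print(header)
```

Once a compiled French catalog for that domain is placed under the locale directory, the same call would return the translated string instead.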

Boldewyn
I looked at it before, and maybe it's a better approach. The only thing I can't quite figure out is how to generate the language files based on what a user adds to the database when hosted at Google. Is there a way to generate these files based on a db.Model?
fredrik
Sorry, I might be a bit slow today, but what do you mean by "based on what a user adds to the database"? Do you want translations of arbitrary, user-generated strings?
Boldewyn
Well, yes. I have a settings page where the admin user can add/edit/remove entries from the dictionary. The dictionary has two string properties called "lang" and "dkey" that I use to determine which entry to fetch. It kinda works like a light version of a CMS.
fredrik
Then it will get quite nasty with PO files. I'll wrap my brain around it and update the answer, if I find a nice way to go (that is, a way where you don't have to parse the .po file yourself...)
Boldewyn
Kinda figured. If you have any idea it would be MUCH appreciated !!
fredrik
+3  A: 

Your initial question is about high CPU usage, and I think the answer is simple: with GAE and non-relational datastores like BigTable, the entries.count() call is expensive, and so is the "for entry in entries" loop if you have a lot of data.
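
As a rough illustration of dropping the separate count() call, the dictionary can be built in a single pass over the results. This is only a sketch: Entry and build_lang_dict are hypothetical names, and a plain list stands in for the datastore query so the snippet runs without App Engine.

```python
from collections import namedtuple

# Hypothetical stand-in for DictionaryEntries rows returned by a query.
Entry = namedtuple("Entry", ["lang", "dkey", "value"])

def build_lang_dict(entries):
    """Build {dkey: value} in one pass, with no separate count() call."""
    return dict((entry.dkey, entry.value) for entry in entries)

sample_entries = [
    Entry("en", "collection.header", "My collection"),
    Entry("en", "collection.footer", "The end"),
]

lang_dict = build_lang_dict(sample_entries)
```

Returning an empty dict rather than False for a language with no entries also means a later "key in data" check works without a special case.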

I think you need to do a couple of things:

In your dictionary module (util/dictionary.py):

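from google.appengine.api import memcache

def GetDictionaryKey(lang, key):
    cache_key = 'dictionary_%s_%s' % (lang, key)
    data = memcache.get(cache_key)
    if data is None:
        # Fetch the single entry matching this language and dictionary key.
        entry = DictionaryEntries.all().filter("lang =", lang).filter("dkey =", key).get()
        if entry:
            data = entry.value
            # memcache.add returns True/False, not the stored value.
            memcache.add(cache_key, data, 60)
        else:
            data = 'no result for %s' % key
    return data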

and in your filter:

def translate(key, lang):
    return dictionary.GetDictionaryKey(lang, key)

This approach is better because:

  • You avoid the expensive count() query.
  • You respect the MVC pattern, because a filter is part of the Template (the View in the pattern) and the GetDictionaryKey method is part of the Controller.
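
The cache-then-query flow can be tried out without App Engine. In this sketch, FakeCache is a hypothetical in-memory stand-in that copies only the get/add behaviour of memcache relied on here (add returns a bool, not the value), and fake_lookup stands in for the datastore query; both names are assumptions for illustration.

```python
class FakeCache(object):
    """Hypothetical in-memory stand-in for memcache (get/add only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def add(self, key, value, time=0):
        # Like memcache.add: report success as a bool, never the value.
        if key in self._store:
            return False
        self._store[key] = value
        return True


def get_translation(cache, lookup, lang, key):
    cache_key = 'dictionary_%s_%s' % (lang, key)
    data = cache.get(cache_key)
    if data is None:
        data = lookup(lang, key)
        if data is None:
            return 'no result for %s' % key
        # Keep the value ourselves; add() only says whether it was stored.
        cache.add(cache_key, data, 60)
    return data


lookup_calls = []

def fake_lookup(lang, key):
    # Stand-in for the datastore query; records how often it is hit.
    lookup_calls.append((lang, key))
    return {("en", "collection.header"): "My collection"}.get((lang, key))


cache = FakeCache()
first = get_translation(cache, fake_lookup, "en", "collection.header")
second = get_translation(cache, fake_lookup, "en", "collection.header")
```

The second call is served from the cache, so the lookup function runs only once for that key.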

Besides, if you are using Django, I suggest you slugify your cache_key:

from django.template.defaultfilters import slugify
from google.appengine.api import memcache

def GetDictionaryKey(lang, key):
    cache_key = 'dictionary_%s_%s' % (slugify(lang), slugify(key))
    data = memcache.get(cache_key)
    if data is None:
        # Fetch the single entry matching this language and dictionary key.
        entry = DictionaryEntries.all().filter("lang =", lang).filter("dkey =", key).get()
        if entry:
            data = entry.value
            # memcache.add returns True/False, not the stored value.
            memcache.add(cache_key, data, 60)
        else:
            data = 'no result for %s' % key
    return data
diegueus9
+1 from me. I haven't tested it, but it seems reasonable that this would lead to a performance improvement. @fredrik: If it does, you should accept this answer. My gettext approach is good for texts that developers control, but I haven't yet found a sound way to do user input translation with Django's gettext support.
Boldewyn
@diegueus9: Thanks. I'll try it as soon as I get home. I've been away for a week on vacation. And I don't seem to be able to change the accept status. I'll email the Stack Overflow team and see if they can change it.
fredrik
@diegueus9: Worked absolutely sweet! Thanks again. One thing: when adding to memcache, you need to do a get too (or keep the value yourself), since memcache.add just returns True or False depending on whether the value was added successfully.
fredrik
@fredrik: I'm glad it works for you. And yes, I had that error; I wrote all of that code without testing it ;)
diegueus9
@diegueus9: Very nice work just off the top of your head! Still waiting for a response from the Stack Overflow team. I hope they change it so you'll be awarded the bounty points.
fredrik