I am trying to use Google Translate from Python with UTF-8 text. How do I call the JSON API? They have documentation for embedding it in HTML, but I can't find a proper API or WSDL anywhere.

Thanks Raphael

A: 

I think you are talking about the AJAX API (http://code.google.com/apis/ajaxlanguage/), which is meant to be used from JavaScript, so I do not understand what you mean by "Google Translate from Python".

Alternatively, if you need translation functionality from Python, you can query the translate page directly and parse the result with an XML/HTML library such as Beautiful Soup or html5lib.

In fact I did that once: Beautiful Soup did not work on the Google Translate page, but html5lib (http://code.google.com/p/html5lib/) did.

You will need to do something like this (copied from my larger code base):

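import urllib
import urllib2
from xml.etree import cElementTree

import html5lib
from html5lib import treebuilders


def translate(text, tlan, slan="en"):
    # Identify the script with a custom User-agent header.
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'translate.py/0.1')]

    # Passing data= turns the request into a POST; the language pair is
    # sent both in the query string and in the form body.
    htmlPage = opener.open(
        "http://translate.google.com/translate_t?" +
        urllib.urlencode({'sl': slan, 'tl': tlan}),
        data=urllib.urlencode({'hl': 'en',
                               'ie': 'UTF8',
                               'text': text.encode('utf-8'),
                               'sl': slan, 'tl': tlan})
    )

    # Parse the returned page into an ElementTree document.
    parser = html5lib.HTMLParser(tree=treebuilders.getTreeBuilder("etree", cElementTree))
    etree_document = parser.parse(htmlPage)

    # _getResult is a helper from the larger code base and is not shown here.
    return _getResult(etree_document)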
Anurag Uniyal
I used a similar script before, but Google banned me for suspicious activity. I was looking for a way to use their AJAX API.
+3  A: 

Here is the code that finally works for me. Querying the website without the AJAX API can get your IP banned, so this is better.

#!/usr/bin/env python
import urllib

import simplejson

# The google translate API can be found here:
# http://code.google.com/apis/ajaxlanguage/documentation/#Examples
def translate(text='hola querida'):
    # langpair is "source|target": translate from Spanish to English.
    sl = "es"
    tl = "en"
    langpair = '%s|%s' % (sl, tl)

    base_url = 'http://ajax.googleapis.com/ajax/services/language/translate?'
    # Pass non-ASCII text as a unicode object so it can be encoded as UTF-8.
    data = urllib.urlencode({'v': 1.0, 'ie': 'UTF8', 'q': text.encode('utf-8'),
                             'langpair': langpair})
    url = base_url + data

    search_results = urllib.urlopen(url)
    response = simplejson.loads(search_results.read())

    return response['responseData']['translatedText']
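
For anyone who wants to check the UTF-8 handling and the response parsing without hitting the service, here is a minimal offline sketch written for Python 3's standard library. The helper names `build_url` and `extract_translation` are made up for illustration. The `responseStatus` and `responseDetails` fields are an assumption about the response envelope; only `responseData` and `translatedText` appear in the code above.

```python
import json
from urllib.parse import urlencode

BASE_URL = 'http://ajax.googleapis.com/ajax/services/language/translate?'


def build_url(text, source='es', target='en'):
    """Build the request URL (hypothetical helper).

    urlencode percent-encodes str values as UTF-8 by default in Python 3,
    so no explicit .encode('utf-8') is needed here.
    """
    params = {'v': '1.0', 'ie': 'UTF8', 'q': text,
              'langpair': '%s|%s' % (source, target)}
    return BASE_URL + urlencode(params)


def extract_translation(body):
    """Return translatedText from a JSON response body (hypothetical helper).

    Assumes the envelope carries responseStatus and responseDetails next to
    responseData, and that responseData is null when the request fails.
    """
    reply = json.loads(body)
    data = reply.get('responseData')
    if reply.get('responseStatus') != 200 or not data:
        raise ValueError(reply.get('responseDetails') or 'translation failed')
    return data['translatedText']
```

Checking the status first avoids a confusing `TypeError` from indexing into `None` when the service rejects a request, for example because of a bad language pair.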
+1  A: 

Look what I found: http://code.google.com/intl/ru/apis/ajaxlanguage/terms.html

Here is the interesting part:

You will not, and will not permit your end users or other third parties to: .... * submit any request exceeding 5000 characters in length; ....
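
One way to stay under that limit is to split long text before sending it. This is only a rough sketch in Python 3: `split_text` is a made-up helper, and it assumes the 5000 characters are counted on the text itself rather than on the whole encoded URL, which the terms do not spell out. If the limit covers the full request, pass a smaller `limit`.

```python
def split_text(text, limit=5000):
    """Split text into chunks of at most `limit` characters.

    Prefers to break at a space; falls back to a hard cut when a run of
    `limit` characters contains no space at all.
    """
    if limit < 1:
        raise ValueError('limit must be positive')
    chunks = []
    while len(text) > limit:
        # Look for the last space that keeps the chunk within the limit.
        cut = text.rfind(' ', 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: hard split
        chunks.append(text[:cut])
        text = text[cut:].lstrip(' ')
    if text:
        chunks.append(text)
    return chunks
```

Each chunk can then be translated with a separate request and the results joined with spaces. Breaking at spaces rather than sentence boundaries may hurt translation quality near the cuts.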

lordspace