Most of us have a lot of content on our sites in one language or another. As web professionals, we spent the time we could have used learning human languages on learning computer languages instead. So we need some way to translate our content.

Google provides a translation service (among others), and given their massive empire I am confident that they have (or shortly will have) the best one. With that in mind, what is the best way to use it? We could just be lazy and use the little widget that they provide - but we would lose all the content and SEO juice because Google would rewrite the links to point to "translate.googleusercontent.com?translate=...".

So my question is - how can we use this service while retaining the translated content on our site?

One method would be to use the Google AJAX API to load the content inline when the user wants it. But since it is powered by JS (like jQuery), search engines won't benefit from this.

Another method would be to use a server-side language (like PHP) to scrape the content from the Google Translate page. But I'm not sure this is 100% legal.

Finally, I was wondering about using mod_rewrite to fetch the page. But again, I don't think this would benefit our site.

# Redirect language-suffixed URLs (e.g. /about-fr) to Google's translated view of the page
RewriteEngine On
RewriteRule ^(.*)-fr$ http://www.google.com/translate_c?hl=fr&sl=en&u=http://site.com/$1 [R,NC,L]
RewriteRule ^(.*)-de$ http://www.google.com/translate_c?hl=de&sl=en&u=http://site.com/$1 [R,NC,L]
RewriteRule ^(.*)-es$ http://www.google.com/translate_c?hl=es&sl=en&u=http://site.com/$1 [R,NC,L]
RewriteRule ^(.*)-it$ http://www.google.com/translate_c?hl=it&sl=en&u=http://site.com/$1 [R,NC,L]

All you would need to do is add a couple of links to your pages with a suffix such as "-fr" appended to whatever URL is in the link, and you're set.

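//View file (the URI is escaped before being printed into the attribute)
View Page in <a href="<?php print htmlspecialchars($uri_string, ENT_QUOTES, 'UTF-8'); ?>-de">German</a>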
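
If the translated pages were served from your own site rather than through a redirect, the application would need to separate the language suffix from the path itself. A rough sketch in Python (the same logic ports to PHP), assuming the same four suffixes as the rules above and English as the source language; the function name is made up for illustration:

```python
import re

# Same suffixes as the rewrite rules above; matching is case-insensitive like [NC].
SUFFIX = re.compile(r"^(?P<path>.*)-(?P<lang>fr|de|es|it)$", re.IGNORECASE)


def split_language_suffix(path, default="en"):
    """Return (path_without_suffix, language_code) for a requested path."""
    match = SUFFIX.match(path)
    if match is None:
        # No recognised suffix: serve the original (assumed English) page.
        return path, default
    return match.group("path"), match.group("lang").lower()


print(split_language_suffix("about-fr"))
print(split_language_suffix("about"))
```

One caveat with this scheme: any real URL that happens to end in one of the suffixes (say, a page called "sign-it") would be treated as a translation request, so a path prefix like /fr/ may be a safer convention.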

Does anyone have any thoughts on this?

:EDIT:

After reading Google's Terms of Service, it seems that:

You will not, and will not permit your end users or other third parties to: incorporate Google Results as the primary content on your Property or any page on your Property; submit any request exceeding 5000 characters in length;

To me, this sounds like you can't use the Google Translate URL to translate the main content - with PHP or AJAX - if that content is the main post of the page. Now how does this work? Why would you build a translation API and then not allow it to be used on the main page content?
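
On the 5000-character limit in that quote: longer text would have to be sent in several requests. A minimal Python sketch of such a splitter follows. It assumes the limit counts plain characters (the terms do not say whether URL encoding counts toward it), and the function name is hypothetical:

```python
def split_for_requests(text, limit=5000):
    """Split text into chunks of at most `limit` characters, preferring to break at a space."""
    chunks = []
    remaining = text
    while len(remaining) > limit:
        # Look for the last space that keeps the chunk within the limit.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: fall back to a hard split
        chunks.append(remaining[:cut])
        remaining = remaining[cut:].lstrip(" ")
    if remaining:
        chunks.append(remaining)
    return chunks


parts = split_for_requests("word " * 2000)
print(len(parts), [len(p) for p in parts])
```

Splitting at spaces can still cut a sentence in half, which may hurt translation quality; splitting at sentence or paragraph boundaries would probably be better where the text allows it.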

+1  A: 

It looks like there is an unofficial PHP API for translating with Google Translate. It's hosted on Google Code, so if Google didn't want it to exist, it would probably be gone by now.

You should make sure to cache the translated pages though.

http://code.google.com/p/gtranslate-api-php/

Jeffrey Aylesworth
Yes, there are several PHP libraries out there. http://code.google.com/p/php-language-api/
Xeoncross
A: 

After more research, it appears that Google does expose a JSON URL for direct requests - so using a server-side language does seem to be an option (as long as the results are cached). However, once you get that content you still need to figure out how to let users access it in the flow of your current app. Perhaps something like the mod_rewrite method mentioned above?
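
A minimal sketch of the caching part, in Python for brevity (the same idea ports to PHP). The `translate` callable is a hypothetical stand-in for whatever wraps the remote service, and the one-file-per-entry cache layout is an assumption, not anything from Google's API:

```python
import hashlib
import os


def cached_translation(text, target, translate, cache_dir):
    """Return the translation of `text`, calling `translate` only on a cache miss."""
    # Key on both the target language and the source text.
    key = hashlib.sha256((target + "\n" + text).encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".txt")
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    translated = translate(text, target)  # hypothetical wrapper around the remote service
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(translated)
    return translated
```

Because the key includes the source text, editing a page automatically produces a new cache entry; old entries are never cleaned up in this sketch.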

Xeoncross
+2  A: 

Well, you should read the EULA; maybe Google doesn't want you to use its service in that way.

Not to mention that Google Translate may be fine across Indo-European languages, but right now, translations to other families of languages really suck, and generate comical, meaningless text (e.g. my own language, Hungarian, is a nightmare for Google). I don't think it'll advance to even a usable level in the near future.

Tamás Szelei
Yes, I would agree that computers have a long way to go to comprehend human speech. But since I neither speak, nor have the means to translate, Hungarian - I must work with what I do have. Even if most of the text is gibberish - a good address or sentence can really help on some of my sites.
Xeoncross
The user will instantly notice that the site is translated by a machine, and will quickly look elsewhere. No one would link to that page for the same reason (the "translation" effectively destroys the information).
Tamás Szelei
Again, you're speaking about content-rich sites - like blogs and articles. Some of my sites are not for reading pleasure, but for information. Information that is probably not listed elsewhere. So in my case, something is better than nothing.
Xeoncross
@Xeoncross: It sounds like you have very little textual content on these sites and you assume that since only a few words need to be translated, it's better to have something than nothing. I tend to think the exact opposite: I have read some machine-translated texts and sometimes they were useful. But only because I was able to make *some* sense out of them because I had enough context! The fewer the words, the higher the risk that something could be translated (and understood) in a wrong way.
paprika
A: 

I think the most SEO friendly way to decide what language to display is to look at the Accept-Language request header, although language flag icons wouldn't be a bad idea either, in case someone using an en-us browser feels more comfortable reading French, for example.
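
Roughly, picking a language from that header could look like the Python sketch below. The function name and the list of supported languages are made up for illustration, and it handles only the common `tag;q=value` form, not every corner of the header grammar:

```python
def pick_language(accept_language, supported, default="en"):
    """Return the best supported language for an Accept-Language header value."""
    candidates = []
    for position, part in enumerate(accept_language.split(",")):
        fields = part.strip().split(";")
        tag = fields[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        candidates.append((-quality, position, tag))
    # Highest quality first; ties keep the order given in the header.
    for neg_quality, _, tag in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 marks a language as not acceptable
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # e.g. "fr-ch" falls back to "fr"
        if primary in supported:
            return primary
    return default


print(pick_language("fr-CH, fr;q=0.9, en;q=0.8", ["en", "de", "fr"]))
```

The flag icons would then simply override whatever this returns, for example via a cookie or the URL.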

Jeffrey Aylesworth
A: 

You can translate text through the Google Language API's REST interface. Here is a PHP library that does it: http://code.google.com/p/php-language-api/

A simple example is on the project page.

bucabay
Thanks, but I already shared that link. ;)
Xeoncross
I didn't notice that. :)
bucabay
+1  A: 

To have a real multilingual site, automated translations are not and will not be a good enough solution. On my site, I've added an interface allowing easy human translation, and Google Translate (as well as Babel Fish) is used to suggest translations before a real human does the actual translation. Check the project at http://transposh.org/ if your site is on WordPress.

Collector
Great implementation of crowdsourcing while using a best-practice fallback.
Xeoncross
A: 

Thank you, Collector!!!! I love the fact that anybody can edit the translation result.. it helps so much. Great great great job!

phanyly