I've been asked to maximize the visibility of an upcoming web application that will initially be available in two languages, French and English.
I would like to understand how robots, like Googlebot, crawl a site that is available in multiple languages.
I have a few questions concerning the behaviour of robots and indexing engines:
- Should a web site specify the language in the URL?
- Will a robot crawl the site in both languages if the language is set through a cookie (assuming there is a link that switches the language)?
- Should I use a distinct domain for each language?
- What meta tags could be used to help a robot understand the language of a web site?
- Am I missing anything that I should be aware of?
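For context, from what I've read so far, `hreflang` link annotations seem relevant, but I'm not sure whether they answer the questions above. Something like this (the `/en/` and `/fr/` URL paths here are just hypothetical, not my actual structure):

```html
<!-- In the <head> of every page, pointing to each language version of that page -->
<link rel="alternate" hreflang="en" href="https://example.com/en/" />
<link rel="alternate" hreflang="fr" href="https://example.com/fr/" />
<!-- Fallback for users whose language is neither English nor French -->
<link rel="alternate" hreflang="x-default" href="https://example.com/" />
```

Is this the right mechanism, or is it only one piece of the puzzle?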