We have the same situation here: our product is also required to show meaningful URLs to the user in potentially every language on earth. All our tools and techniques support UTF-8, so that part is no problem. Percent-escaping the UTF-8 characters technically works, but IE (7, 8) shows the ugly escaped URLs, whereas Firefox unescapes them and displays nice URLs; e.g. '/français/Banane.html' is displayed in IE as '/fran%C3%A7ais/Banane.html'.
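For reference, the escaped form that IE displays is simply the UTF-8 bytes of each non-ASCII character written as %XX. Here is a minimal sketch of that escaping using only the JDK's java.net.URI. The class and method names (PathEscapeSketch, escapePath) are made up for illustration and are not the escaper we actually used:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PathEscapeSketch {

    // Hypothetical helper: percent-encodes the non-ASCII characters of a path
    // as their UTF-8 bytes.
    static String escapePath(String path) throws URISyntaxException {
        // The multi-argument constructor accepts non-ASCII characters in the path;
        // toASCIIString() then writes them as %XX escapes of their UTF-8 bytes.
        return new URI(null, null, path, null).toASCIIString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // \u00e7 is the c with cedilla from the example path
        System.out.println(escapePath("/fran\u00e7ais/Banane.html"));
        // prints: /fran%C3%A7ais/Banane.html
    }
}
```

Plain ASCII paths pass through unchanged, so the helper can be applied to every path without special-casing.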
Redirecting after a form submit (GET after POST) did not work at all, neither with raw UTF-8 URLs nor with escaped UTF-8 URLs. We also tried XML-style numeric entity encoding, without success.
However, we finally found a way to successfully redirect after a POST: taking the UTF-8 bytes of the URL and re-decoding them as ISO-8859-1. None of us really understands why this works (how can the browser know how to decode it, given that the number of bytes per UTF-8 character varies, and how does it know the bytes were originally UTF-8?), but it does.
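One possible explanation, offered as an assumption and not something we verified against the container's source: ISO-8859-1 maps every byte value to exactly one char, so the re-decoded string is a lossless carrier for the original UTF-8 bytes. If the container writes the Location header as ISO-8859-1 (assumed here), the browser receives the raw UTF-8 bytes and only has to guess that they are UTF-8. The sketch below shows the round trip with plain JDK calls; the class and method names are illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1CarrierSketch {

    // Same trick as in the redirect: UTF-8 bytes re-decoded as ISO-8859-1,
    // which yields exactly one char per byte.
    static String wrap(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // What a header writer using ISO-8859-1 would put on the wire (assumption).
    static byte[] wireBytes(String wrapped) {
        return wrapped.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // \u00e7 is the c with cedilla from the example path
        String original = "/fran\u00e7ais/Banane.html";
        byte[] onTheWire = wireBytes(wrap(original));

        // The wire bytes are identical to the UTF-8 encoding of the original string.
        System.out.println(Arrays.equals(onTheWire, original.getBytes(StandardCharsets.UTF_8)));
        // Decoding them as UTF-8 restores the original string.
        System.out.println(new String(onTheWire, StandardCharsets.UTF_8).equals(original));
    }
}
```

Both checks print true. Whether a given browser then interprets those bytes as UTF-8 is a separate question that this sketch cannot answer.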
Here's a simple servlet to try that out:
    package springapp.web.servlet;

    import java.io.IOException;
    import java.io.InputStream;

    import javax.servlet.ServletContext;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import org.apache.commons.io.IOUtils;

    public class TestServlet extends HttpServlet {

        private static final long serialVersionUID = -1743198460341004958L;

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            // Fallback target, used if the URL file is missing
            String url = "çöffte.html";
            ServletContext context = getServletContext();
            // Read the UTF-8 encoded Russian URL; the stream is null if the file does not exist
            try (InputStream is = context.getResourceAsStream("/WEB-INF/ru_url.txt")) {
                if (is != null) {
                    // trim() drops a trailing line break, which must not end up in the Location header
                    url = IOUtils.toString(is, "UTF-8").trim();
                    System.out.println(String.format("Redirecting to [%s]", url));
                }
            } catch (IOException ioEx) {
                ioEx.printStackTrace();
            }

            // Re-decode the UTF-8 bytes as ISO-8859-1 (one char per byte)
            byte[] utfBytes = url.getBytes("UTF-8");
            String result = new String(utfBytes, "ISO-8859-1");
            resp.sendRedirect(result);

            // does not work:
            //resp.sendRedirect(url);
            //resp.sendRedirect(Utf8UrlEscaper.escapeUtf8(url));
            //resp.sendRedirect(Utf8UrlEscaper.escapeToNumericEntity(url));
        }
    }
For the redirect target, copy and paste any native-language URL (e.g. from Wikipedia) into a UTF-8 encoded file (without BOM!) and save it in the WEB-INF directory. In our example we took a Russian URL (http://ru.wikipedia.org/wiki/Заглавная_страница) and saved it in a file named 'ru_url.txt'.
We created a simple Spring MVC application that maps any *.abc URL to the test servlet.
Now if you start the app and enter something like 'localhost:8080/springmvctest/a.abc', you should be redirected to the Russian Wikipedia site, and the browser (IE and Firefox; possibly not Safari or others) should show a nice, native Russian URL.