how do you handle them?

+2  A: 

In domain names, you can use IDN (internationalized) domains, provided they are supported by the registrars you want to register them with.

Elsewhere in the URL, browsers generally send them as UTF-8, percent-encoded (URL-encoded). Only recently I was looking at:

http://en.wikipedia.org/wiki/Pfeffern%C3%BCsse

I found it curious that there was a ü in the URL (sent as %C3%BC). Firefox displays it as the proper character, though.
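As a rough illustration of what the browser does when it displays that URL, here is a minimal Python sketch. It assumes the path is UTF-8 percent-encoded, which is the default that `urllib.parse.unquote` uses; the variable names are just for illustration.

```python
from urllib.parse import unquote

# Path portion of the Wikipedia URL above, as sent over the wire.
encoded_path = "/wiki/Pfeffern%C3%BCsse"

# unquote() turns each %XX escape back into a byte and decodes the
# resulting byte sequence as UTF-8 (its default encoding).
decoded_path = unquote(encoded_path)

print(decoded_path)  # /wiki/Pfeffernüsse
```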

MarkR
A: 

You'll want to have a look at both IDNA and Punycode, which are the standards that handle this in domain names.
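To get a feel for what that looks like, here is a small Python sketch using the built-in `idna` codec. Note that this codec implements the older IDNA 2003 rules, so treat it as an illustration rather than a full IDNA implementation; the domain name is only an example.

```python
# Example domain; any name with non-ASCII letters works the same way.
domain = "schröder.com"

# The "idna" codec converts each non-ASCII label to its Punycode
# form and prefixes it with "xn--"; ASCII labels are left alone.
ascii_form = domain.encode("idna")
print(ascii_form)  # b'xn--schrder-d1a.com'

# Decoding reverses the process for display purposes.
round_trip = ascii_form.decode("idna")
print(round_trip)  # schröder.com
```

The ASCII form is what actually goes into DNS; the Unicode form is only what the user sees.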

Jim Puls
+1  A: 

You might want to take a look at RFC 3986, Uniform Resource Identifier (URI): Generic Syntax, which defines percent-encoding for characters that are not allowed literally in a URL. The general idea is to UTF-8 encode each character, convert each resulting byte to its two-digit hex value, and prefix each of those with a '%'.
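Those steps can be sketched in a few lines of Python. This is only a hand-rolled illustration (the `percent_encode` helper is hypothetical, not part of any library); in practice `urllib.parse.quote` does the same job, and the last line compares against it.

```python
import string
from urllib.parse import quote

# Characters RFC 3986 calls "unreserved"; these never need escaping.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def percent_encode(text: str) -> str:
    """Percent-encode every byte of text that is not unreserved."""
    out = []
    for byte in text.encode("utf-8"):   # step 1: UTF-8 encode
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)              # plain ASCII stays as-is
        else:
            out.append(f"%{byte:02X}")  # step 2: '%' + two hex digits
    return "".join(out)

print(percent_encode("Pfeffernüsse"))   # Pfeffern%C3%BCsse
print(percent_encode("Pfeffernüsse") == quote("Pfeffernüsse", safe=""))
```

The ü becomes two escapes because it takes two bytes in UTF-8 (0xC3 0xBC).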

Of course, another option is just to strip them out of the URL or replace them with something like an underscore. It depends on your requirements.
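If you go the strip-or-replace route, something along these lines would do it in Python. Both helper names are made up for this sketch, and it assumes you simply want everything outside ASCII gone.

```python
import re

def strip_non_ascii(text: str) -> str:
    """Drop every character that is not plain ASCII."""
    return text.encode("ascii", "ignore").decode("ascii")

def replace_non_ascii(text: str, placeholder: str = "_") -> str:
    """Swap every non-ASCII character for a placeholder."""
    return re.sub(r"[^\x00-\x7F]", placeholder, text)

print(strip_non_ascii("Pfeffern\u00fcsse"))    # Pfeffernsse
print(replace_non_ascii("Pfeffern\u00fcsse"))  # Pfeffern_sse
```

Keep in mind that both approaches lose information, so two different titles can end up with the same URL.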

roryf
A: 

The problem with those names is that the characters are easily confused with others, so I would need a very good reason to use them. For example, if your company name is "Schröder", I would register all of schröder.com, schroder.com and even schroeder.com. The extra cost is justified because it is just too easy for someone else to create a malicious lookalike name.
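For what it's worth, here is a rough Python sketch of how you might list the ASCII variants worth registering alongside the real name. The `ascii_variants` helper and the small German transliteration table are assumptions made for this example; other languages would need their own rules.

```python
import unicodedata

# Assumed transliterations for German; extend as needed.
GERMAN = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}

def ascii_variants(label: str) -> list[str]:
    """Return the label, an accent-stripped form, and a transliterated form."""
    # NFKD splits "ö" into "o" plus a combining diaeresis, which we drop.
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    transliterated = "".join(GERMAN.get(c, c) for c in label)
    return [label, stripped, transliterated]

for name in ascii_variants("schr\u00f6der"):
    print(name + ".com")
```

This prints the three names from the example: schröder.com, schroder.com and schroeder.com.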

Gamecat