I would like to write a C# method that transforms any title into a URL-friendly string, similar to what Stack Overflow does:
- replace spaces with dashes
- remove parentheses
- etc.
I'm thinking of removing the reserved characters defined in RFC 3986 (as listed on Wikipedia), but I don't know whether that would be enough. It would make the links work, but does anyone know which other characters are replaced here on Stack Overflow? I don't want to end up with percent-encoded sequences in my URLs...
Current implementation
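// Remove reserved and other unwanted punctuation.
string result = Regex.Replace(value.Trim(), @"[!*'""`();:@&+=$,/\\?%#\[\]<>«»{}_]", "");
// Replace whitespace or a dash (with any surrounding whitespace) with a single hyphen.
return Regex.Replace(result.Trim(), @"\s*[\-–—\s]\s*", "-");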
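For context, here is a minimal, self-contained sketch of how the two replacements above might be wrapped into a method. The class name `SlugSketch` and the method name `ToUrlFriendly` are hypothetical, and two steps are my own assumptions rather than anything I know Stack Overflow does: the second pattern is swapped for `[\s\-–—]+` so that runs of spaces and dashes collapse into one hyphen, and the result is trimmed of stray hyphens and lowercased.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlugSketch
{
    // Hypothetical helper: turns a title into a URL-friendly string.
    public static string ToUrlFriendly(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            return string.Empty;
        }

        // Step 1: drop the same punctuation as the character class above.
        string result = Regex.Replace(
            value.Trim(), @"[!*'""`();:@&+=$,/\\?%#\[\]<>«»{}_]", "");

        // Step 2 (assumption): collapse any run of whitespace, hyphens,
        // en dashes and em dashes into a single hyphen.
        result = Regex.Replace(result.Trim(), @"[\s\-–—]+", "-");

        // Step 3 (assumption): remove leading/trailing hyphens and lowercase.
        return result.Trim('-').ToLowerInvariant();
    }
}
```

With this sketch, `SlugSketch.ToUrlFriendly("Hello World (test)!")` gives `hello-world-test`. It does not limit the length of the result or handle accented characters, which is part of what I'm asking about.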
My questions
- Which characters should I remove?
- Should I limit the maximum length of the resulting string?
- Does anyone know which rules are applied to titles here on SO?
A sub-question
Should I move this question to Meta even though it's programming-related?