I have a form on my website where users can add new pages. I need to generate SEO-friendly URLs and make these URLs unique. Which characters can I use in a URL? I know that I should convert spaces to underscores:

" "->"_" and, before that, underscores to something else, for example:

"_"->/underscore

That makes it easy to get the title back from the URL.

But in my case a title can contain any character on the keyboard, even: @#%:"{/\';.>

Are there any reasons not to use these characters in a URL?

The important requirements are:

-easy generation of the URL, and of the title back from the URL (without querying the database)

-each title is unique, so each URL must be too

-SEO-friendly URLs

+1  A: 

Aren't you querying the database to get the content anyway? In which case just grab the title field in the same query.

The only way to reliably get the title back from the URL is to 'URL encode' it (in PHP, use the rawurlencode() function; urlencode() is similar but encodes spaces as + rather than %20). However, you will end up with URLs like this:

My%20page%20title
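
A minimal sketch of that round trip, written in Python rather than PHP; `quote` and `unquote` from `urllib.parse` play the role of PHP's rawurlencode() and rawurldecode(). The helper names `title_to_path` and `path_to_title` are made up for this example.

```python
from urllib.parse import quote, unquote

def title_to_path(title: str) -> str:
    # safe="" also encodes "/", which quote() would otherwise leave as-is.
    return quote(title, safe="")

def path_to_title(segment: str) -> str:
    # Reverse of title_to_path: decode the percent-escapes.
    return unquote(segment)

print(title_to_path("My page title"))  # My%20page%20title

# The awkward characters from the question survive the round trip too.
awkward = '@#%:"{/\\\';.>'
assert path_to_title(title_to_path(awkward)) == awkward
```

The encoding is lossless, so no database lookup is needed to recover the title, but the URLs are not pretty.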

You can't replace any characters because you will then not have unique URLs. If you are replacing spaces with underscores, for example, the following titles will all produce the same URL:

My page title
My_page title
My_page_title
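
The collision is easy to reproduce. A small Python sketch, where `naive_url` is just a hypothetical name for the space-to-underscore replacement:

```python
def naive_url(title: str) -> str:
    # Replace every space with an underscore and change nothing else.
    return title.replace(" ", "_")

titles = ["My page title", "My_page title", "My_page_title"]
urls = {naive_url(t) for t in titles}

# Three distinct titles collapse into a single URL.
print(urls)  # {'My_page_title'}
```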

In short: don't worry about one extra database hit and just use SEO-friendly URLs by limiting to lowercase a-z, 0-9 and dashes, like my-page-title. Like I said, you can just grab everything in one query anyway.
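
One possible way to build such a slug, sketched in Python. The function name `slugify` and the exact rules (collapse every run of other characters into one dash) are assumptions for illustration, not something the answer prescribes:

```python
import re

def slugify(title: str) -> str:
    # Lowercase first, then turn every run of characters outside
    # a-z and 0-9 into a single dash.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    # Drop dashes left over at either end.
    return slug.strip("-")

print(slugify("My page title"))   # my-page-title
print(slugify("My_page title!"))  # my-page-title
```

The two calls print the same slug, which is exactly why the title cannot be recovered from the URL and has to come from the database. Note that this simple version also discards non-ASCII letters.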

DisgruntledGoat
Yes, but then I have more problems when inserting records (we have to generate a unique URL), plus an additional column in the table. I can first convert underscores to, for example, "(.)" and then spaces to underscores, so that is not a problem.
Thomas
But the link will be: http://example.com/page_sdf(.)sdf So it is not SEO-friendly, I think. But when I browse Wikipedia pages I can see many such links, for example: http://pl.wikipedia.org/wiki/A* or http://pl.wikipedia.org/wiki/%22Ad_leones!%22
Thomas
And importantly, there will be far fewer normal links (without underscores and other strange characters) on my website.
Thomas
If you convert underscores to something else you still have the same problem. Maybe it will rarely occur but it's still possible if someone includes "(.)" in their title.
DisgruntledGoat
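
A quick Python sketch of the collision described in the comment above. `marker_url` is a hypothetical name for the proposed scheme (underscores to "(.)", then spaces to underscores):

```python
def marker_url(title: str) -> str:
    # Step 1: underscores become the marker "(.)".
    # Step 2: spaces become underscores.
    return title.replace("_", "(.)").replace(" ", "_")

# A title with an underscore and a title that already contains
# the marker end up at the same URL.
print(marker_url("sdf_sdf"))    # sdf(.)sdf
print(marker_url("sdf(.)sdf"))  # sdf(.)sdf
```

For the mapping to be reversible, the marker itself would also have to be escaped whenever it appears in a title.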
Also to create unique URLs, why don't you use a solution like Stack Overflow and include a unique ID in the URL? This will also speed up the database queries because you're just looking up via ID, not a long text string.
DisgruntledGoat
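
A rough Python sketch of the ID-plus-slug idea. The `/pages/` prefix and the helper names are invented for this example. Only the numeric ID is used for the lookup; the slug is there for readers and search engines.

```python
import re
from typing import Optional

def slugify(title: str) -> str:
    # Lowercase a-z, 0-9 and dashes only.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def build_url(page_id: int, title: str) -> str:
    # Hypothetical route layout: /pages/<id>/<slug>
    return f"/pages/{page_id}/{slugify(title)}"

_PAGE_PATH = re.compile(r"^/pages/(\d+)(?:/[^/]*)?$")

def parse_id(path: str) -> Optional[int]:
    # The slug part is ignored, so a stale or mistyped slug still resolves.
    match = _PAGE_PATH.match(path)
    return int(match.group(1)) if match else None

print(build_url(42, "My page title"))     # /pages/42/my-page-title
print(parse_id("/pages/42/old-title"))    # 42
```

Uniqueness comes from the ID, so two pages whose titles produce the same slug still get different URLs.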
My website is something like Wikipedia. If you search for a phrase (GET), I redirect you to the SEO-friendly URL, and if that title is not in the table you see a notice and can add a new post under that title.
Thomas
Yes, Stack Overflow has an SEO-friendly and efficient solution, but my website has not only direct links to user posts; the search feature is also very important. With the Stack Overflow solution I must check whether the searched phrase (from GET) is in the table; if it is, I redirect from the GET link to the SEO-friendly one, and there I must fetch the record from the table again (this time by ID). So I prefer a solution like Wikipedia's.
Thomas
We could use the session to save this information from the search, but that is too inefficient, I think.
Thomas
It is possible to convert the characters to ASCII values, sum them, and store that instead of an ID. But we must preserve the order of the characters too, so the first, second, ... characters must each have a different multiplier.
Thomas
But it will be a very long number: 3 digits per character.
Thomas
But of course we can use the hexadecimal system, or our own system using the whole alphabet :), so the URL will contain only a-z, 0-9 and underscore.
Thomas
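
For what it's worth, the hexadecimal variant of this idea can be sketched in a few lines of Python. The function names are made up, and encoding the title as UTF-8 is an assumption:

```python
def title_to_hex(title: str) -> str:
    # Each byte of the UTF-8 encoded title becomes two hex digits.
    return title.encode("utf-8").hex()

def hex_to_title(token: str) -> str:
    # Reverse of title_to_hex.
    return bytes.fromhex(token).decode("utf-8")

token = title_to_hex("A*")
print(token)                # 412a
print(hex_to_title(token))  # A*
```

It is reversible and uses only 0-9 and a-f, but the URL is twice as long as the title (in bytes) and tells a reader or a search engine nothing.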
Is it efficient?
Thomas
Do you ever think anything you *don't* post? ;) (You can use the edit link to edit your comment, for a few minutes at least.) Seriously though, I'm not following you. Why not just add something that checks if the URL generated from the title is unique? Then don't let people submit things with duplicate titles or URLs, or add a number to the end, like `My_page_title`, `My_page_title_2` and so on.
DisgruntledGoat
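
The numbering idea from the comment above, as a Python sketch. Here `taken` stands in for whatever check you run against the table; the name and the in-memory set are assumptions for the example:

```python
def unique_url(base: str, taken: set) -> str:
    # Use the plain URL if it is still free.
    if base not in taken:
        return base
    # Otherwise try base_2, base_3, ... until a free one is found.
    n = 2
    while f"{base}_{n}" in taken:
        n += 1
    return f"{base}_{n}"

taken = {"My_page_title"}
print(unique_url("My_page_title", taken))  # My_page_title_2
```

In a real application a unique index on the URL column would still be needed, since two users can submit the same title at the same moment.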