Better URLs in PHP (without htaccess)

Posted Wednesday, June 8, 2005

For a while I had wanted to improve the readability of the URLs on my website, but I never had enough time to dive into htaccess. As a temporary fix I chose to expose the title in the URL, together with the id. Here's the how and why.

[EDIT: I have updated the function to replace accented characters and punctuation.]
First, you need to know that when it comes to SEO, URLs and semantics matter. A lot.
Using the various header levels (instead of classes) boosted my Google visibility very impressively, but semantics are mostly good for machines. So I moved along: after changing the page title to match the content (which is good for both humans and machines), I made my URLs more human-friendly, because a numeric id is not informative at all, except to one database in particular.
I'm rewriting the whole backend, so I didn't want to use htaccess and mod_rewrite for this, yet I still wanted to be able to reuse the components.
Even if you can't yet have friendly URLs like archive/user-friendly-urls, something like index.php?id=123456/user+friendly+urls is still much more readable. And it only requires that you url-encode your titles to include them in the link. I use the following function:

function make_friendly_URL($input) {
    // Accented characters mapped to unaccented letters (uppercase, then lowercase).
    $trans = array("¥" => "Y", "µ" => "u", "À" => "A", "Á" => "A", "Â" => "A", "Ã" => "A", "Ä" => "A", "Å" => "A", "Æ" => "A", "Ç" => "C", "È" => "E", "É" => "E", "Ê" => "E", "Ë" => "E", "Ì" => "I", "Í" => "I", "Î" => "I", "Ï" => "I", "Ð" => "D", "Ñ" => "N", "Ò" => "O", "Ó" => "O", "Ô" => "O", "Õ" => "O", "Ö" => "O", "Ø" => "O", "Ù" => "U", "Ú" => "U", "Û" => "U", "Ü" => "U", "Ý" => "Y", "ß" => "s",
        "à" => "a", "á" => "a", "â" => "a", "ã" => "a", "ä" => "a", "å" => "a", "æ" => "a", "ç" => "c", "è" => "e", "é" => "e", "ê" => "e", "ë" => "e", "ì" => "i", "í" => "i", "î" => "i", "ï" => "i", "ð" => "o", "ñ" => "n", "ò" => "o", "ó" => "o", "ô" => "o", "õ" => "o", "ö" => "o", "ø" => "o", "ù" => "u", "ú" => "u", "û" => "u", "ü" => "u", "ý" => "y", "ÿ" => "y",
        // Punctuation and spaces. strtr() tries the longest keys first, so " - " wins over " ".
        "I'll" => "I+will", "'" => "+", "(" => "", ")" => "", "!" => "", " " => "+", " - " => "+/+");
    return strtr($input, $trans);
}

Explanation: the array maps accented characters to their unaccented letters, and most punctuation to an empty string (the apostrophe becomes a + sign). Also, since I often separate French and English in titles with space-dash-space, I want this sequence to keep its role as a separator.
Spaces are converted to + signs. At first I used an underscore (for readability), but then I discovered that search engines tend to treat it as part of the word rather than as a separator (compare the number of results with key_word, key-word and finally key+word).
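The space-dash-space rule only works because of how strtr() resolves overlapping keys: when given an array, it tries the longest keys first, so " - " is matched before the single space gets a chance. Here is a minimal sketch of that behaviour, using an abbreviated map (only four entries, picked for the demonstration) and assuming the script and the input string share the same encoding:

```php
<?php
// Abbreviated map: only the entries needed for this demonstration.
$trans = array(
    " "   => "+",    // single space
    " - " => "+/+",  // space-dash-space
    "é"   => "e",
    "!"   => "",
);

// strtr() with an array tries the longest keys first, so " - " is
// replaced as a whole, whatever the order of the entries in the array.
echo strtr("Résumé - Summary!", $trans), "\n";
// Resume+/+Summary
```

Had the replacements been chained one after the other with str_replace(), the single spaces would have been converted first and the " - " rule would never have matched.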
Now you just need to add this to your links, and let the world grow used to it...
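As a rough sketch of what "adding this to your links" can look like: the two helpers below, article_link() and article_id(), are hypothetical names invented for this illustration, and make_friendly_URL() is reduced to a handful of entries so that the example stays self-contained. The receiving page only needs the number, and an (int) cast keeps the leading digits of the parameter and ignores everything after them.

```php
<?php
// Abbreviated stand-in for make_friendly_URL(); the real map is much longer.
function make_friendly_URL($input) {
    $trans = array("é" => "e", "'" => "+", "!" => "", " " => "+", " - " => "+/+");
    return strtr($input, $trans);
}

// Hypothetical helper: builds the link target for an article.
function article_link($id, $title) {
    return "index.php?id=" . (int) $id . "/" . make_friendly_URL($title);
}

// Hypothetical helper: recovers the numeric id on the receiving page.
// The (int) cast stops at the first non-digit, so the title part is dropped.
function article_id($raw) {
    return (int) $raw;
}

echo article_link(123456, "User friendly URLs"), "\n";
// index.php?id=123456/User+friendly+URLs

// PHP decodes the + signs back to spaces before filling $_GET['id'].
echo article_id("123456/User friendly URLs"), "\n";
// 123456
```

Note that the translation map leaves characters such as & and # untouched, so a title containing them would still need extra handling before going into a query string.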
As a last tip: Google is reputedly the only search engine that crawls what comes after the ? sign, so beware and try to use the htaccess method if you can.

[UPDATE: The following tutorial explains most of the uses .htaccess can be put to. Worth reading if you want to go further.]
