Hello,

I'm working with UK address data and also International address data.

I need to geocode the address data for use on a Google map. I'm doing this using the HTTP service, i.e. constructing a query string and passing it to file_get_contents($THEURL).

I've managed to geocode 80% of the address data perfectly; however, addresses in countries like Norway and Sweden that contain special characters will not return a geocode. The code returned is 602 (cannot find an address).

Looking into the documentation, I can see that the string sent to Google must be UTF-8 encoded.

I've tried the following to ensure the string is UTF-8 encoded, or to remove the special characters:

1) Using UTF8 encode on the query string - this often results in malformed characters being displayed on the screen.

2) mb_check_encoding reports the string is correctly encoded.

3) Using a function to substitute special characters with their European equivalents (in the hope the Google API will compensate).

Can anyone suggest a reason why my method isn't working (whether to do with encoding or not)?

Thanks,

Ben

+3  A: 

You need to systematically go through every part of your system that handles text and establish which encoding it uses. mb_detect_encoding and guesswork are not a good approach here.

You need to check the encoding of:

  • incoming data
    • pages
    • GET parameters
    • database connection
    • database table collations
  • the script files you work with
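As a rough illustration of stating the encoding explicitly instead of detecting it, here is a minimal sketch for two items on the list. It assumes a MySQL database reached through PDO; the helper name utf8_dsn, the host and the database name are placeholders, not anything from the question.

```php
<?php
// Hypothetical helper: builds a PDO DSN that names the connection charset.
// Assumes MySQL; 'utf8mb4' is MySQL's name for full UTF-8.
function utf8_dsn(string $host, string $dbname): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbname);
}

// Tell PHP's multibyte (mb_*) functions which encoding to assume by default.
mb_internal_encoding('UTF-8');

// Placeholder host and database name.
$dsn = utf8_dsn('localhost', 'addresses');

// With real credentials, the connection would then exchange UTF-8 with the server:
// $pdo = new PDO($dsn, $user, $password);
```

Without a charset on the connection, the server may hand back text in its default encoding even when the tables themselves store UTF-8.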

If malformed characters occur, chances are you are using ISO-8859-1 or some other non-UTF-8 encoding somewhere. When everything is clean UTF-8, the request should go through.
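To make that concrete, here is a small sketch of turning an address into a percent-encoded request URL. It assumes that any input that is not valid UTF-8 is ISO-8859-1. The helper names (to_utf8, geocode_url), the base URL and the parameter name q are placeholders for illustration, not the real geocoding endpoint.

```php
<?php
// Hypothetical helper: returns the address as valid UTF-8.
// Assumes any non-UTF-8 input is ISO-8859-1; change this to the real source encoding.
function to_utf8(string $address): string
{
    if (mb_check_encoding($address, 'UTF-8')) {
        return $address; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($address, 'UTF-8', 'ISO-8859-1');
}

// Hypothetical helper: appends the address to a base URL as a percent-encoded parameter.
function geocode_url(string $baseUrl, string $address): string
{
    return $baseUrl . '?' . http_build_query(['q' => to_utf8($address)]);
}

// "Tromsø" written as ISO-8859-1 bytes, as it might arrive from a misconfigured source.
$latin1 = "Storgata 1, Troms\xF8";
echo geocode_url('https://geocoder.example/geo', $latin1), "\n";
// prints https://geocoder.example/geo?q=Storgata+1%2C+Troms%C3%B8
```

The validity check is only a safety net: some ISO-8859-1 byte sequences are also valid UTF-8, so knowing the source encoding up front remains more reliable than checking after the fact.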

A very good article on the basics is Joel Spolsky's "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)".

Pekka
It was the connection! Genius.
Ben Waine