Hi,

I have a string that may contain a latitude/longitude pair. The rest of the string could be anything, but if a pair is present I want to extract it using PHP. I think I need a regular expression, but I don't have a clue how to write one. An example string:

a random string dfdff33338983 33.707352,-116.272797 more dfdfndfdf

+4  A: 

If the string could contain anything, then there is no regular expression, or indeed any piece of code, that could reliably extract the longitude and latitude.

This can be confirmed with the following string:

7.123456,40.404040 is nothing like 33.707352,-116.272797 or 99.111222,-22.333444.

Which one of those is the latitude?

You could try something like:

-?\b\d+\.\d{6},-?\d+\.\d{6}\b

as a starting point.
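
As a minimal sketch of how that starting point could be used from PHP, the pattern can be passed to `preg_match_all()`. The function name `findCoordinatePairs` is made up for illustration, the optional minus is placed before the word boundary so that a leading `-` on the first number is kept, and the six-decimal requirement is taken from the example strings, not from any standard. The code does not decide which number is the latitude; it only returns the pairs in the order they appear.

```php
<?php
// Hypothetical helper: returns every "number,number" pair that has exactly
// six decimal places, in the order the numbers appear in the text.
function findCoordinatePairs(string $text): array
{
    // Capture group 1 is the first number, group 2 the second.
    $pattern = '/(-?\b\d+\.\d{6}),(-?\d+\.\d{6})\b/';
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $pairs = [];
    foreach ($matches as $m) {
        $pairs[] = [(float) $m[1], (float) $m[2]];
    }
    return $pairs;
}

$pairs = findCoordinatePairs('a random string dfdff33338983 33.707352,-116.272797 more dfdfndfdf');
print_r($pairs);
```

For the example string this should yield a single pair; a string without any six-decimal pair gives an empty array.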

paxdiablo
Oh, OK. I was thinking of something like this: if there's an integer in the string with a dot 2 or 3 characters along from it, I could be fairly sure it was a longitude, and from there I'd take everything up to the next comma. Then I'd look for the minus sign and take everything from there up to the next space, which would give the latitude. It doesn't need to be perfect; it would just be good if I could get it most of the time. Is this not doable?
elduderino
Actually, that wouldn't work, because a lat/long sometimes starts with a minus sign and sometimes doesn't.
elduderino
@elduderino, that's a lot different than "string could contain anything" :-) Have a look at the regex I provided as a starting point. You can use `?` to indicate zero or one occurrences.
paxdiablo
Note that there is no reason why a lat/lon combination must include decimal points: 60,10 is a perfectly valid coordinate. Decimal points may even be swapped for commas if your input depends on a locale.
relet
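
A sketch of a looser pattern along those lines, with the decimal part made optional, plus a range check that rejects pairs which cannot be coordinates. It assumes the order is latitude first, then longitude, which the input itself cannot confirm, and it does not handle comma decimal separators. The function name `parseLatLon` is hypothetical.

```php
<?php
// Hypothetical helper: returns [lat, lon] for the first plausible pair, or null.
// Assumes "lat,lon" order and a dot as the decimal separator.
function parseLatLon(string $text): ?array
{
    // The decimal part (\.\d+) is optional, so "60,10" is accepted too.
    // The lookarounds stop the pair from starting or ending inside a longer token.
    $pattern = '/(?<![\w.])(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)(?!\.?\w)/';
    if (!preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        return null;
    }
    foreach ($matches as $m) {
        $lat = (float) $m[1];
        $lon = (float) $m[2];
        // Latitude is limited to [-90, 90], longitude to [-180, 180].
        if (abs($lat) <= 90 && abs($lon) <= 180) {
            return [$lat, $lon];
        }
    }
    return null;
}

var_dump(parseLatLon('meet at 60,10 tomorrow'));
var_dump(parseLatLon('99.111222,-22.333444'));
```

The second call returns `null` because 99.111222 is outside the latitude range. The trade-off of accepting integers is more false positives: any two small numbers separated by a comma will pass.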
OK, I'll give this a go. Thanks.
elduderino
+1  A: 

You can match using this:

.*\s(.*),(.*?)\s.*

See this answer in Rubular.

Answer in PHP:

$txt = "dfdff333 38983 33.707352,-116.272797 dfd fndfdf";
$lat = preg_replace("/.*\s(.*),.*?\s.*/", "$1", $txt);
$lon = preg_replace("/.*\s.*,(.*?)\s.*/", "$1", $txt);

echo $lat."\n"; // 33.707352
echo $lon."\n"; // -116.272797

Note: I'm using the comma as the delimiter between the two values.


EDIT: you can use a more specific regex, like

$lat = preg_replace("/.*\s(-?\d+\.\d+),-?\d+\.\d+?\s.*/", "$1", $txt);
$lon = preg_replace("/.*\s-?\d+\.\d+,(-?\d+\.\d+?)\s.*/", "$1", $txt);

Thanks, @SoapBox.
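
The two `preg_replace` calls can also be folded into a single `preg_match` with capture groups, which has the side benefit of reporting when no pair is found (with `preg_replace`, a non-matching input comes back unchanged). This is a sketch of that alternative, not the code from the answer; the function name `extractLatLon` and the group names `lat` and `lon` are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: extracts the first "number,number" pair from $txt.
// Returns ['lat' => ..., 'lon' => ...] as strings, or null if none is found.
function extractLatLon(string $txt): ?array
{
    // Named groups keep the two numbers apart; (?<!\S) and (?!\S) require
    // whitespace or a string boundary on either side of the pair.
    $pattern = '/(?<!\S)(?<lat>-?\d+\.\d+),(?<lon>-?\d+\.\d+)(?!\S)/';
    if (preg_match($pattern, $txt, $m) !== 1) {
        return null;
    }
    return ['lat' => $m['lat'], 'lon' => $m['lon']];
}

$txt = "dfdff333 38983 33.707352,-116.272797 dfd fndfdf";
$pair = extractLatLon($txt);
echo $pair['lat'] . "\n";
echo $pair['lon'] . "\n";
```

Unlike the patterns above, this version also accepts a pair at the very start or end of the string, since it does not insist on a literal whitespace character on each side.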

Topera
That seems to work really well, but I really need to do it in PHP. Would that regex work there?
elduderino
Check the answer now. I changed it to a PHP answer. :)
Topera
I disapprove of using .* to match the numbers. You should be using `-?\d+\.\d+` because they have a well-defined format. Using .* here will match all kinds of other unexpected things.
SoapBox
@SoapBox: you're right! Thanks.
Topera
Brilliant, it works for me. It hasn't failed once in 300-ish strings.
elduderino