What kind of work has been done to determine whether a specific string pertains to a geographical location? For example:
'troy, ny'
'austin, texas'
'hotels in las vegas, nv'
What I'm expecting is a statistical approach that gives a degree of confidence that the first two are locations. The last one would probably require a heuristic that extracts the "%s, %s" pattern and then applies the same technique. I'm specifically looking for approaches that don't rely too heavily on the preposition 'in', since it is neither an unambiguous nor a consistently available indicator of location.
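To make the "%s, %s" heuristic concrete, here is a minimal Python sketch of what I have in mind. Everything in it is an assumption for illustration: the tiny `CITIES` and `STATES` sets stand in for a real gazetteer, `location_score` is a made-up name, and the score is just the fraction of tokens covered by the matched place name, not a real statistical confidence.

```python
import re

# Toy gazetteer (assumption for illustration only; a real one would be
# loaded from a proper place-name dataset).
CITIES = {"troy", "austin", "las vegas"}
STATES = {"ny", "nv", "tx", "texas", "new york", "nevada"}

# "<words>, <words>" anchored at the end of the string.
PAIR = re.compile(r"([a-z .'-]+),\s*([a-z .]+)$")

def location_score(text):
    """Return (score, matched_place); score is a crude 0..1 value."""
    m = PAIR.search(text.lower().strip())
    if not m:
        return 0.0, None
    left_words = m.group(1).split()
    region = m.group(2).strip()
    if region not in STATES:
        return 0.0, None
    # Try the longest suffix of the left-hand side first, so that
    # "hotels in las vegas" yields "las vegas" without keying on "in".
    for i in range(len(left_words)):
        candidate = " ".join(left_words[i:])
        if candidate in CITIES:
            region_len = len(region.split())
            covered = len(left_words) - i + region_len
            total = len(left_words) + region_len
            return covered / total, f"{candidate}, {region}"
    return 0.0, None

for query in ("troy, ny", "austin, texas", "hotels in las vegas, nv"):
    print(query, "->", location_score(query))
```

With this toy data the first two strings are fully covered by a known place, while the third matches only on its "las vegas, nv" suffix and so scores lower. What I'm missing is a principled replacement for the lookup and the scoring.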
Can anyone point me to approaches, papers, or existing utilities? Thanks!