Wikipedia is geotagging a lot of its articles. (Look in the top right corner of the page.)

Is there any API for querying all geotagged pages within a specified radius of a geographical position?

Update

Okay, so based on lost-theory's answer I tried this (on dbpedia query explorer):

PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
SELECT ?subject ?label ?lat ?long WHERE {
    ?subject geo:lat ?lat.
    ?subject geo:long ?long.
    ?subject rdfs:label ?label.
    FILTER(xsd:float(?lat) - 57.03185 <= 0.05 && 57.03185 - xsd:float(?lat) <= 0.05
        && xsd:float(?long) - 9.94513 <= 0.05 && 9.94513 - xsd:float(?long) <= 0.05
        && lang(?label) = "en"
    ).
} LIMIT 20

This is very close to what I want, except that it returns results within a (local) square around the point and not a circle. I would also like the results to be sorted by distance from the point, if possible.

Update 2

I am trying to use the Euclidean distance as an approximation of the true distance, but I am having trouble squaring a number in SPARQL. (Question opened here.) When I get something useful I will update the question, but in the meantime I will appreciate any suggestions on alternative approaches.

Update 3

A final update. I gave up on using SPARQL through dbpedia. I have written a simple parser which fetches the nightly Wikipedia article text database dump and parses all articles for geocodes. It works rather nicely, and it allows me to store information about geotagged articles however I wish.

This is probably the solution I will continue using, and if I get around to creating a nice interface to it I might consider allowing public API access and/or publishing the source of the parser.
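
For anyone wanting to try the dump route, here is a rough sketch of the geocode-extraction step. It is not my actual parser. It assumes the coordinates appear in a {{coord|...}} template without nested templates, in either decimal or degrees/minutes/seconds form; the names COORD_TEMPLATE, parse_coord and extract_coords are made up for this example, and real dumps contain other geotagging templates that this ignores.

```python
import re

# Assumed template shape: {{coord|...}} with no nested templates inside.
COORD_TEMPLATE = re.compile(r"\{\{\s*coord\s*\|([^{}]*)\}\}", re.IGNORECASE)


def _to_degrees(values, hemisphere):
    """Combine degrees[, minutes[, seconds]] into signed decimal degrees."""
    degrees = sum(float(v) / 60 ** i for i, v in enumerate(values))
    return -degrees if hemisphere in ("S", "W") else degrees


def parse_coord(params):
    """Return (lat, lon) for one template's parameter string, or None."""
    positional = []
    for part in (p.strip() for p in params.split("|")):
        if "=" in part or ":" in part:  # named/typed parameter, e.g. display=title
            break
        positional.append(part)
    try:
        if len(positional) == 2:  # plain decimal form: lat|lon
            return float(positional[0]), float(positional[1])
        upper = [p.upper() for p in positional]
        ns = next(i for i, p in enumerate(upper) if p in ("N", "S"))
        ew = next(i for i, p in enumerate(upper) if p in ("E", "W"))
        lat = _to_degrees(positional[:ns], upper[ns])
        lon = _to_degrees(positional[ns + 1:ew], upper[ew])
        return lat, lon
    except (ValueError, StopIteration):
        return None  # not a form this sketch understands


def extract_coords(wikitext):
    """Return every (lat, lon) pair found in a chunk of article wikitext."""
    found = (parse_coord(m.group(1)) for m in COORD_TEMPLATE.finditer(wikitext))
    return [c for c in found if c is not None]
```

Calling extract_coords on the text of each article in the dump gives a list of (latitude, longitude) pairs that can be stored in whatever form suits the later radius queries.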

Thanks for all the suggestions, comments and help!

+1  A: 

There are a couple of tools listed on Tools and applications based on coordinates from Wikipedia. I'm not sure if it's what you're looking for, but the Geosearch.py tool looks pretty cool.

Bill the Lizard
+3  A: 

You should be able to query for latitude/longitude using SPARQL and dbpedia. An example (from here):

SELECT distinct ?s ?la ?lo ?name ?country WHERE {
?s dbpedia2:latitude ?la .
?s dbpedia2:longitude ?lo .
?s dbpedia2:officialName ?name .
?s dbpedia2:country ?country .
filter (
  regex(?country, 'England|Scotland|Wales|Ireland')
  && regex(?name, '^[Aa]')
)
}

You can run your own queries here.

lost-theory
Very interesting. I am unsure about this SPARQL syntax, though. How would I query for all articles within a specific area (defined by latitude, longitude and radius)?
bjarkef
I'm unsure if SPARQL supports trigonometrical functions (it doesn't appear to); but you could filter your data set to a square to get a first "cut" of results, and then do great circle distances "client side", and apply a second set of filtering.
Rowland Shaw
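
The two-step approach described in the comment above (square filter on the server, great-circle filter on the client) could look roughly like the following in Python. This is only a sketch: the function names and the (label, lat, lon) row format are assumptions for the example, and the Earth is treated as a sphere with a mean radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(rows, lat, lon, radius_km):
    """Keep rows inside the circle, sorted by distance from the centre.

    `rows` is an iterable of (label, lat, lon) tuples, for example the
    output of a square-filtered query (hypothetical format).
    """
    hits = []
    for label, row_lat, row_lon in rows:
        distance = haversine_km(lat, lon, row_lat, row_lon)
        if distance <= radius_km:
            hits.append((distance, label))
    hits.sort()  # nearest first
    return hits
```

Because the list is sorted on the distance, the same pass also gives the nearest-first ordering asked for in the question.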
A: 

I'm not familiar enough with SPARQL, but if it can use powers in its filter then it's easy to compute the distance of a given article from a given point using Pythagoras' theorem (a^2 + b^2 = c^2), and that would give you all the articles within a radius.

Another option would be to get a Wikipedia data dump and process it yourself - this is what I did when I needed to do some linguistic analysis on Wikipedia articles.

Guss
This is what I am trying to get working right now. The results would be off close to the poles or at large radii, as latitude and longitude are not Cartesian coordinates, but they will probably be approximately okay locally. However, I simply have no idea how to compute the power of something in SPARQL, or even where to look up how to compute it. I opened a question on it here: http://stackoverflow.com/questions/1401401/power-in-sparql-and-other-math-functions When I reach a satisfying solution I will update the question, but until then, I will appreciate any suggestions. :)
bjarkef
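
As an illustration of the local approximation discussed here, the Pythagorean distance can be made less wrong away from the equator by scaling the longitude difference by the cosine of the latitude. A minimal Python sketch, assuming roughly 111.195 km per degree of latitude (the function name and constant are made up for the example):

```python
import math

KM_PER_DEGREE = 111.195  # approximate length of one degree of latitude


def approx_distance_km(lat1, lon1, lat2, lon2):
    """Flat-earth (equirectangular) distance in km; only reasonable locally."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dy = (lat2 - lat1) * KM_PER_DEGREE
    # A degree of longitude shrinks by cos(latitude) away from the equator.
    dx = (lon2 - lon1) * KM_PER_DEGREE * math.cos(mean_lat)
    # Squaring by plain multiplication, so no power function is needed.
    return math.sqrt(dx * dx + dy * dy)
```

The approximation degrades near the poles, across the 180° meridian and at large radii, as noted in the comment above.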
I looked in the SPARQL reference on W3 before posting this answer, and the math operations I saw there are less than satisfactory. That being said, there was some discussion of adding operators using embedded JavaScript, which may be a solution, but I didn't dive into that due to lack of time.
Guss
Sounds like what I have found. I guessed the square root operator (math:sqrt), which works, but even that seems not to be documented on the W3 page. And this is not for displaying on a web page, so I am unsure how any JavaScript solution would help (though I noticed that discussion myself).
bjarkef
It's quite possible for a SPARQL processor to have a JavaScript parser to handle that. If you can get `math:sqrt` to work, then `math:pow` may also work.
Guss
math:pow didn't work for me, nor did trying to multiply values by themselves (some compiler error about a syntax error at '(' which I didn't understand).
Guss
My problem exactly. What I really need is a good specification of the SPARQL syntax and available 'libraries'.
bjarkef