Is there a good way to determine how many pages the Twitter Search API has returned? Or is there a way to determine how many results were returned, so I can divide that by the number of tweets per page?

A: 

No, the results don't include the number of pages. The Atom data does include a 'next page' link element that you can follow iteratively until that element is no longer present.

<link type="application/atom+xml" rel="next" href="http://search.twitter.com/search.atom?lang=en&amp;max_id=1775692928&amp;page=11&amp;q=YOURQUERY"/>
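
For what it's worth, here is a rough Python sketch of following that link until it disappears. It assumes the feed uses the standard Atom namespace and that the 'next' link is a direct child of the feed element. The names find_next_link and iter_pages, the fetch callable, and the max_pages safety cap are all made up for illustration; plug in whatever HTTP client you already use.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterator, Optional

# Standard Atom namespace (assumed to be what the feed declares).
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def find_next_link(atom_xml: str) -> Optional[str]:
    """Return the href of the rel="next" link, or None if there isn't one."""
    root = ET.fromstring(atom_xml)
    for link in root.findall(ATOM_NS + "link"):
        if link.get("rel") == "next":
            return link.get("href")
    return None


def iter_pages(
    first_url: str,
    fetch: Callable[[str], str],
    max_pages: int = 50,  # hypothetical safety cap so the loop always ends
) -> Iterator[str]:
    """Yield each page body, following rel="next" links until none remain."""
    url: Optional[str] = first_url
    for _ in range(max_pages):
        if url is None:
            break
        body = fetch(url)  # fetch is any function mapping a URL to its body
        yield body
        url = find_next_link(body)
```

Usage would be something like `for body in iter_pages(start_url, my_fetch): ...`, where my_fetch wraps your HTTP library of choice.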
great_llama
A: 

So you could potentially loop through the pages until you're given an empty query result.

Tim
or until your software gets tired of doing so. :)
Jason S
+4  A: 

No. The API does not expose this; not because it's not a useful feature, but because of the performance aspects of providing it.

To get a complete count of results, the search algorithm would have to iterate through its entire index for each query. When you came back for the second page, it would have to iterate through its index from page 2 onward to give you the count again. Getting all the data would therefore be O(n^2) (because returning each of the n pages requires scanning all the later pages) instead of the expected O(n).

Because most requestors only want a few pages of results, it's a common optimization for the query to return only partial results, with just a pointer into the index to allow the search to continue at the point it left off.

Most high-scale paginated APIs behave in a similar fashion for these reasons. To get an accurate count, you'll have to force the query to completely iterate its index by looping through the pages. This comes with a high cost to the remote service, and making you come back many times allows the service to appropriately throttle your query so it does not negatively impact other users.
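
As a rough illustration of counting by walking the pages, here is a small Python sketch. It assumes you have already fetched each page body as an Atom string and that each result is a top-level entry element in the standard Atom namespace. The name count_entries is made up for this example.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

# Each search result is assumed to be an Atom <entry> under the feed root.
ATOM_ENTRY = "{http://www.w3.org/2005/Atom}entry"


def count_entries(page_bodies: Iterable[str]) -> int:
    """Sum the number of <entry> elements across all fetched pages."""
    total = 0
    for body in page_bodies:
        root = ET.fromstring(body)
        total += len(root.findall(ATOM_ENTRY))
    return total
```

The count is only known once the last page has been fetched, which is exactly the cost described above.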

Brad B
A: 

It's worth mentioning that the total number of pages also varies with the rpp parameter, which controls the number of tweets returned per page (max of 100).

According to the search API docs, the page parameter returns pages only up to a maximum of ~1500 total results.
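
To make the arithmetic concrete, here is a tiny Python sketch of the upper bound on pages for a given rpp. It takes the ~1500 figure and the rpp maximum of 100 at face value; both are approximate limits from the docs rather than guarantees, and the name max_pages is made up.

```python
MAX_RESULTS = 1500  # approximate cap on total results (assumed from the docs)
MAX_RPP = 100       # maximum tweets per page


def max_pages(rpp: int, max_results: int = MAX_RESULTS) -> int:
    """Upper bound on the number of pages reachable for a given rpp."""
    if not 1 <= rpp <= MAX_RPP:
        raise ValueError("rpp must be between 1 and 100")
    # Ceiling division: a partly filled final page still counts as a page.
    return -(-max_results // rpp)
```

So with rpp=100 you would get at most 15 pages, and with rpp=15 at most 100.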

Mark Biek
A: 

Following what great_llama said, how do you check for the 'next' link element and loop so that it keeps retrieving results for every page that has one? I've tried using XMLDocument, but I think my implementation is wrong.

youngscientist
This should have been a comment, not an answer.
honk