views:

264

answers:

2

Hi, goodday.

For a research project I would like to get the last 3 months worth of Twitter messages. Technical challenges aside, is this possible? by using some sort of slow polling mechanism to keep the rate limiter at bay?

The Twitter API states "Clients may request up to 3,200 statuses via the page and count parameters for timeline REST API" Are these per hour? Per day? or...ever?

Any suggestions? Would it even be theoretically possible? Did some one do something similar before?

Thanks! Marco

A: 

You could use the Search API, don't give it a search, return the maximum of 100 per page, then got through each page twice a minute(120 times an hour - 30 times less than the rate limit). However, if my math is correct, that could possibly give you 720,000 tweets an hour..... the problem is that Twitter has added approximately 1.75 billion tweets over the past 3 months. So if my math is correct, it would take you 2361 days, or 6 years to complete this.

You could ask this question over on the Twitter Development talk on Google Groups, or contact Twitter to get white-listed so you could make up to 20,000 requests an hour.

Personally, I don't think it's possible.

Eclipsed4utoo
So, in that case, it's more of a -get as much as possible, and factor in the estimated percentage that is not dumped? i'm whitelisted, so it would probably take about 20 day's then if I would like to get all of them...in theory.
A: 

Twitter notoriously does not make "available" tweets older than three weeks. In some cases you can only get one week. You're better off storing tweets for the next three months. Many rightly doubt if they're even persisted by Twitter.

Are you looking for just any tweets? If so, check out the Streaming API's status/sample method. The streaming API uses persistent HTTP sockets that can be a pain to program, but it's quite graceful when you get it working. I'd recommend setting up a little script to dump tweets from status/sample into a DB. You should have a TON of data after just a few days.

Ted Pennings