I have a Python script that constantly grabs data from Twitter and writes the messages to a file. Every hour, I want the program to write the current time to the file as well. My script is below. Currently, it enters the timestamp function and just keeps printing the time every 10 seconds.

#! /usr/bin/env python
import tweetstream
import simplejson
import urllib
import time
import datetime
import sched

class twit: 
    def __init__(self,uname,pswd,filepath):
        self.uname=uname
        self.password=pswd
        self.filepath=open(filepath,"wb")

    def main(self):
        i=0
        s = sched.scheduler(time.time, time.sleep)
        output=self.filepath

        #Grab every tweet using Streaming API
        with tweetstream.TweetStream(self.uname, self.password) as stream:
            for tweet in stream:
                if tweet.has_key("text"):
                    try:
                        #Write tweet to file and print it to STDOUT
                        message=tweet['text']+ "\n"
                        output.write(message)
                        print tweet['user']['screen_name'] + ": " + tweet['text'], "\n"

                        ################################
                        #Timestamp code
                        #Timestamps should be placed once every hour
                        s.enter(10, 1, t.timestamp, (s,))
                        s.run()
                    except KeyError:
                        pass
    def timestamp(self,sc):
        now = datetime.datetime.now()
        current_time= now.strftime("%Y-%m-%d %H:%M")
        print current_time
        self.filepath.write(current_time+"\n")


if __name__=='__main__':
    t=twit("rohanbk","cookie","tweets.txt")
    t.main()

Is there any way for my script to do this without checking the time every other minute with an if statement to see how much time has elapsed? Can I use a scheduled task, as I've done above, with a slight modification to my current implementation?

+3  A: 

Your code

sc.enter(10, 1, t.timestamp, (sc,))

asks to be scheduled again in 10 seconds. If you want it to be scheduled once an hour,

sc.enter(3600, 1, t.timestamp, (sc,))

is better, since an hour is 3600 seconds, not 10!

Also, the line

s.enter(1, 1, t.timestamp, (s,))

schedules a timestamp 1 second after every tweet is written. What's the point of that? Just schedule the first invocation of timestamp once, outside the loop, and change its period from 10 seconds to 3600.

Alex Martelli
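
The advice above (queue the first timestamp once, outside the loop, and have it re-queue itself every 3600 seconds) could be sketched roughly as follows. This is not the answerer's code. It assumes Python 3.3 or later, where `scheduler.run(blocking=False)` runs only the events that are already due and then returns; in the Python 2 script from the question, `s.run()` blocks until the queue is empty. The names `INTERVAL` and `record`, and the standalone `timestamp` function, are hypothetical, and a plain list of strings stands in for the tweet stream.

```python
import datetime
import sched
import sys
import time

INTERVAL = 3600  # seconds in one hour (hypothetical constant name)


def timestamp(scheduler, out, interval=INTERVAL):
    """Write the current time to `out`, then queue the next call."""
    now = datetime.datetime.now()
    out.write(now.strftime("%Y-%m-%d %H:%M") + "\n")
    # Re-schedule from inside the callback so it repeats every `interval` seconds.
    scheduler.enter(interval, 1, timestamp, (scheduler, out, interval))


def record(messages, out, scheduler):
    """Write each message, then run any timestamp event that has come due."""
    for message in messages:
        out.write(message + "\n")
        # Non-blocking: returns immediately when no event is due yet.
        scheduler.run(blocking=False)


if __name__ == "__main__":
    s = sched.scheduler(time.time, time.sleep)
    # The first invocation is queued exactly once, outside the loop.
    s.enter(INTERVAL, 1, timestamp, (s, sys.stdout))
    record(["first tweet", "second tweet"], sys.stdout, s)
```

One consequence of this design is that a timestamp is written only when a message arrives after the hour has elapsed, which should be acceptable for a stream that is constantly producing tweets. If timestamps must appear even while the stream is idle, the scheduler would have to run in its own thread instead.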