I want to split the calendar into two-week intervals starting at 2008-May-5, or any arbitrary starting point.

So I start with several date objects:

import datetime as DT

raw = ("2010-08-01",
       "2010-06-25",
       "2010-07-01",
       "2010-07-08")

transactions = [(DT.datetime.strptime(datestring, "%Y-%m-%d").date(),
                 "Some data here") for datestring in raw]
transactions.sort()

By checking the dates manually, I can work out which ones fall within the same fortnight interval. I want a grouping similar to this one:

# Fortnight interval 1
(datetime.date(2010, 6, 25), 'Some data here')
(datetime.date(2010, 7, 1), 'Some data here')
(datetime.date(2010, 7, 8), 'Some data here')

# Fortnight interval 2
(datetime.date(2010, 8, 1), 'Some data here')
+3  A: 
import datetime as DT
import itertools

start_date=DT.date(2008,5,5)

def mkdate(datestring):
    return DT.datetime.strptime(datestring, "%Y-%m-%d").date()

def fortnight(date):
    return (date-start_date).days //14

raw = ("2010-08-01",
       "2010-06-25",
       "2010-07-01",
       "2010-07-08")
transactions=[(date,"Some data") for date in map(mkdate,raw)]
transactions.sort(key=lambda (date,data):date)

for key,grp in itertools.groupby(transactions,key=lambda (date,data):fortnight(date)):
    print(key,list(grp))

yields

# (55, [(datetime.date(2010, 6, 25), 'Some data')])
# (56, [(datetime.date(2010, 7, 1), 'Some data'), (datetime.date(2010, 7, 8), 'Some data')])
# (58, [(datetime.date(2010, 8, 1), 'Some data')])

Note that 2010-6-25 is in the 55th fortnight from 2008-5-5, while 2010-7-1 is in the 56th. If you want them grouped together, simply change start_date (to something like 2008-5-16).
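The effect of moving the starting point can be checked with a short sketch. This version is written for Python 3, where tuple-unpacking lambdas such as `lambda (date,data): ...` are not available; the helper name `group_by_fortnight` is made up for this illustration and is not part of the answer above.

```python
import datetime as DT
import itertools

def group_by_fortnight(transactions, start_date):
    """Group (date, data) tuples into 14-day intervals counted from start_date."""
    def fortnight(transaction):
        date, _data = transaction
        return (date - start_date).days // 14
    # groupby only merges consecutive items, so sort by date first.
    ordered = sorted(transactions, key=lambda t: t[0])
    return [(key, list(grp)) for key, grp in itertools.groupby(ordered, key=fortnight)]

raw = ("2010-08-01", "2010-06-25", "2010-07-01", "2010-07-08")
transactions = [(DT.datetime.strptime(s, "%Y-%m-%d").date(), "Some data") for s in raw]

# Print the fortnight index and the number of transactions in each group.
for start in (DT.date(2008, 5, 5), DT.date(2008, 5, 16)):
    groups = group_by_fortnight(transactions, start)
    print(start, [(key, len(items)) for key, items in groups])

# 2008-05-05 [(55, 1), (56, 2), (58, 1)]
# 2008-05-16 [(55, 3), (57, 1)]
```

With the later starting point, the three June/July dates land in the same fortnight, which matches the grouping asked for in the question.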

PS. The key tool used above is itertools.groupby, which is explained in detail here.

Edit: The lambdas are simply a way to make "anonymous" functions. (They are anonymous in the sense that they are not given names like functions defined by def). Anywhere you see a lambda, it is also possible to use a def to create an equivalent function. For example, you could do this:

import operator
transactions.sort(key=operator.itemgetter(0))

def transaction_fortnight(transaction):
    date,data=transaction
    return fortnight(date)

for key,grp in itertools.groupby(transactions,key=transaction_fortnight):
    print(key,list(grp))
unutbu
`//14` is the same as `/14` in Python2, but is necessary in Python3 to get integer division (since `/14` gives floating-point division in Python3). By using `//14` you future-proof your code a little bit. See http://docs.python.org/library/stdtypes.html#numeric-types-int-float-long-complex
unutbu
`//` is often called integer division, but it is really floor division: it divides and rounds the result down to the nearest integer. When used with floats, the result stays a float.
Tony Veijalainen
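A few quick checks illustrate the point about `//`. The variable names are only for this example.

```python
int_result = 7 // 2        # two ints give an int
float_result = 7.0 // 2    # a float operand gives a float, still rounded down
negative_result = -7 // 2  # rounds toward negative infinity, not toward zero

print(int_result)       # 3
print(float_result)     # 3.0
print(negative_result)  # -4
```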
I'm not sure if I understand how the `lambda` works here. As I understand about `lambdas`, they're particularly useful for making them work over `iterable`s. Do `sort()` and `groupby()` perform some iteration operations on their `key`s?
Kit
Thank you, @Tony. It's good to point out that `//` is *not* the same as `/` (in Python2) when operating on floats.
unutbu
@Kit: In `groupby`, each element in `transactions` is handed to the `key` function. An element of `transactions` is a tuple `(date,data)`. The `key` function `lambda (date,data):fortnight(date)` receives `(date,data)` as input and returns `fortnight(date)`. This is just an integer used to classify which group `(date,data)` should be grouped with.
unutbu
@Kit: `lambda`s are just a way to create functions. They don't necessarily have anything to do with iterables, but you are right, they show up a lot with `sort` and `groupby` because those functions take `key` arguments which expect functions. I could rewrite the above without any `lambda`s. I'll edit my post to show what I mean.
unutbu
Please validate my understanding. In every iteration of `groupby`, `key` __sometimes__ receives a new integer value (or any type, integers at least in this example). Then every element of `transactions` that gets the same `key` gets grouped together. Am I correct?
Kit
You are correct, Kit. @unutbu: `//` is not limited to Python3; it behaves the same in Python2 (I normally use Python 2.7 and tested it there). So it is not the same as Python2 `/`, but it is the same as Python2 `//`.
Tony Veijalainen
`groupby` iterates over the elements of `transactions`. Each element is passed to the function specified by the `key` argument. The `key` function does not receive an integer, it returns an integer. The consecutive elements that have the same integer are grouped together.
unutbu
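Because only consecutive elements with the same key are merged, the input needs to be sorted first. A small sketch with made-up day offsets (not the transaction data from the question) shows the difference:

```python
import itertools

def fortnight(offset):
    # Key function: maps a day offset to its 14-day interval index.
    return offset // 14

offsets = [1, 15, 2, 16]  # hypothetical day offsets, deliberately unsorted

unsorted_groups = [(k, list(g)) for k, g in itertools.groupby(offsets, key=fortnight)]
sorted_groups = [(k, list(g)) for k, g in itertools.groupby(sorted(offsets), key=fortnight)]

print(unsorted_groups)  # [(0, [1]), (1, [15]), (0, [2]), (1, [16])]
print(sorted_groups)    # [(0, [1, 2]), (1, [15, 16])]
```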
This is very helpful. Thank you!
Kit
@Kit: You're very welcome. `itertools` is a great tool to have in your pocket, well worth every second spent studying it.
unutbu
+1  A: 

Use itertools.groupby with a lambda that takes the distance from the starting point and floor-divides it by the length of the period.

>>> from itertools import groupby
>>> for i, group in groupby(range(30), lambda x: x // 7):
...     print(list(group))


[0, 1, 2, 3, 4, 5, 6]
[7, 8, 9, 10, 11, 12, 13]
[14, 15, 16, 17, 18, 19, 20]
[21, 22, 23, 24, 25, 26, 27]
[28, 29]

So with dates:

import datetime as DT
import itertools as it

start = DT.date(2008,5,5)
lenperiod = 14  # period length in days

# transactions must already be sorted by date
for fnight,info in it.groupby(transactions,lambda data: (data[0]-start).days // lenperiod):
    print(list(info))

You can also use week numbers from strftime, with lenperiod given as a number of weeks:

lenperiod = 2  # weeks; note that %W restarts at 0 every year
for fnight,info in it.groupby(transactions,lambda data: int(data[0].strftime('%W')) // lenperiod):
    print(list(info))
Tony Veijalainen
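One thing to watch with the week-number approach is that `%W` starts over every year, so the same week number can appear in different years. A possible workaround, sketched here as an assumption rather than as part of the answer above, is to include the year in the key. The names `WEEKS_PER_PERIOD` and `year_and_fortnight` are invented for this example.

```python
import datetime as DT
import itertools

WEEKS_PER_PERIOD = 2  # assumed: a fortnight expressed in weeks

def year_and_fortnight(transaction):
    date, _data = transaction
    week = int(date.strftime("%W"))  # week of the year, Monday as first day
    return (date.year, week // WEEKS_PER_PERIOD)

transactions = sorted([
    (DT.date(2009, 7, 1), "Some data"),
    (DT.date(2010, 7, 1), "Some data"),
    (DT.date(2010, 7, 2), "Some data"),
])

groups = [(key, list(grp))
          for key, grp in itertools.groupby(transactions, key=year_and_fortnight)]
for key, items in groups:
    print(key, items)
```

Unlike the day-offset version, these periods are tied to the calendar year rather than to an arbitrary starting date, and a fortnight that straddles New Year is split in two.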