Hello,

I am currently using the following code to get the intersection of the date columns of two sets of financial data. Each array has the columns date, open, high, low, and close.

#find intersection of date strings
def intersect(seq1, seq2):
    res = []                     # start empty
    for x in seq1:               # scan seq1
        if x in seq2:            # common item?
            res.append(x)


    return res


x = intersect(seta[:,0], setb[:,0])    # dates common to both sets
print x

The problem is that it returns only the column on which it computed the intersection, namely the date column. I would like it to return a different array instead, one holding the close values from each set: if a date is common to both, return a 2x1 array of the two corresponding close values. Any ideas? Thanks.

A: 

Okay, here is a complete solution.

Get a Python library to download stock quotes

Get some quotes

start_date, end_date = '20090309', '20090720'
ibm_data = get_historical_prices('IBM', start_date, end_date)
msft_data = get_historical_prices('MSFT', start_date, end_date)

Convert rows into date-keyed dictionaries of dictionaries

def quote_series(series):
    columns = ['open', 'high', 'low', 'close', 'volume']
    return dict((item[0], dict(zip(columns, item[1:]))) for item in series[1:])

ibm = quote_series(ibm_data)
msft = quote_series(msft_data)
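
To make the shape of the result concrete, here is a small sketch of the same conversion run on made-up rows. The sample values and the header row are assumptions for illustration only; the real output of get_historical_prices may be laid out differently.

```python
# Hypothetical sample in the shape the conversion above expects:
# a header row, then [date, open, high, low, close, volume] rows.
sample = [
    ['Date', 'Open', 'High', 'Low', 'Close', 'Volume'],
    ['2009-07-20', '115.0', '116.9', '114.6', '116.4', '8000000'],
    ['2009-07-17', '110.4', '115.5', '110.2', '115.4', '9000000'],
]

def quote_series(series):
    columns = ['open', 'high', 'low', 'close', 'volume']
    # Skip the header row and key each remaining row by its date string.
    return dict((item[0], dict(zip(columns, item[1:]))) for item in series[1:])

quotes = quote_series(sample)
print(quotes['2009-07-20']['close'])  # prints 116.4
```

Each date maps to a small dictionary, so a single field is read as quotes[date]['close'].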

Do the intersection of dates thing

ibm_dates = set(ibm.keys())
msft_dates = set(msft.keys())

both = ibm_dates.intersection(msft_dates)

for d in sorted(both):
    print d, ibm[d], msft[d]
hughdbrown
Hey Hugh, is this structure of *item* some commonplace financial data format that everybody is familiar with?!
ThomasH
You've modelled each data item as an object, but if I got the OP right, each is an array. So your set constructors should probably go *set(item[0]...)*, your dicts like *dict((item[0], item[1:])...)* and your print statement *print d, da[d][3], db[d][3]*, since OP is only interested in the cl part of each item, right?!
ThomasH
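
The array-based variant suggested in the comment above could look roughly like this. It is only a sketch: common_closes, date_col and close_col are names made up here, the rows are invented sample data, and it assumes each row is ordered date, open, high, low, close as in the question (so the close sits at index 4 of the full row).

```python
def common_closes(rows_a, rows_b, date_col=0, close_col=4):
    """Return (date, close_a, close_b) for each date present in both inputs."""
    # Index the second set by date so each lookup is a dictionary access.
    closes_b = dict((row[date_col], row[close_col]) for row in rows_b)
    # Walk the first set in order and keep only the dates found in the second.
    return [(row[date_col], row[close_col], closes_b[row[date_col]])
            for row in rows_a if row[date_col] in closes_b]

# Made-up rows in the order date, open, high, low, close.
seta = [['2009-07-16', 10.0, 11.0, 9.5, 10.5],
        ['2009-07-17', 10.5, 11.5, 10.0, 11.0],
        ['2009-07-20', 11.0, 12.0, 10.5, 11.5]]
setb = [['2009-07-17', 20.0, 21.0, 19.5, 20.5],
        ['2009-07-20', 20.5, 21.5, 20.0, 21.0],
        ['2009-07-21', 21.0, 22.0, 20.5, 21.5]]

pairs = common_closes(seta, setb)
# pairs == [('2009-07-17', 11.0, 20.5), ('2009-07-20', 11.5, 21.0)]
```

The result keeps the order of the first set. If a date occurs more than once in the second set, the last occurrence wins, which may or may not be what you want.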
A: 

How's that:

def intersect(seq1, seq2):
    # seq1 and seq2 are single rows: (date, o, h, l, cl)
    if seq1[0] == seq2[0]:  # compare the date columns
        return (seq1[4], seq2[4])  # return 2-tuple with the cls values
    return None  # dates differ
ThomasH
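
The function above looks at one row from each set. To cover two whole arrays it has to be applied to every pair of rows; one possible way to do that is sketched below. intersect_all is a hypothetical helper and the rows are invented sample data, again assuming the order date, open, high, low, close.

```python
def intersect(seq1, seq2):
    # seq1 and seq2 are single rows; index 0 is the date, index 4 the close.
    if seq1[0] == seq2[0]:
        return (seq1[4], seq2[4])
    return None  # dates differ

def intersect_all(rows_a, rows_b):
    """Collect the close pairs for every date that occurs in both row sets."""
    results = []
    for row_a in rows_a:
        for row_b in rows_b:
            pair = intersect(row_a, row_b)
            if pair is not None:
                results.append(pair)
    return results

# Made-up rows in the order date, open, high, low, close.
seta = [['2009-07-16', 10.0, 11.0, 9.5, 10.5],
        ['2009-07-17', 10.5, 11.5, 10.0, 11.0],
        ['2009-07-20', 11.0, 12.0, 10.5, 11.5]]
setb = [['2009-07-17', 20.0, 21.0, 19.5, 20.5],
        ['2009-07-20', 20.5, 21.5, 20.0, 21.0]]

closes = intersect_all(seta, setb)
# closes == [(11.0, 20.5), (11.5, 21.0)]
```

This compares every row of one set with every row of the other, so it gets slow for long series; indexing one set by date first avoids that.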