I'm writing a program to calculate latency from a tcpdump/pcap file, and I want to specify rules on the command line to correlate packets: that is, find the time between sending a packet that matches rule A and receiving a packet that matches rule B. A concrete example is a FIX NewOrderSingle being sent and the corresponding FIX ExecutionReport being received.

This is an example of the fields in a packet (before they've been converted into dictionary form). I match on the numeric field tag (in parentheses) rather than the English field name:

    BeginString (8): FIX.4.2
    BodyLength (9): 132
    MsgType (35): D (ORDER SINGLE)
    SenderCompID (49): XXXX
    TargetCompID (56): EXCHANGE
    MsgSeqNum (34): 1409104
    SendingTime (52): 20100723-12:49:52.296
    Side (54): 1 (BUY)
    Symbol (55): A002
    ClOrdID (11): BUY704552
    OrderQty (38): 1000
    OrdType (40): 2 (LIMIT)
    Price (44): 130002
    TimeInForce (59): 3 (IMMEDIATE OR CANCEL)
    QuoteID (117): A002
    RelatdSym (46): A002
    CheckSum (10): 219 [correct]
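
As a rough illustration of the "dictionary form" mentioned above, the sketch below turns lines like these into a dict keyed by field number. The regular expression and the name `parse_fields` are assumptions made for this example, not the asker's actual code, and it keeps only the first whitespace-delimited token of each value (so the decoded description such as "(ORDER SINGLE)" is dropped, and a value containing spaces would be truncated).

```python
import re

# Hypothetical pattern for lines such as "MsgType (35): D (ORDER SINGLE)".
FIELD_LINE = re.compile(r"^\s*\w+ \((\d+)\): (\S+)")

def parse_fields(text):
    """Map FIX tag number -> raw value string, one entry per matching line."""
    fields = {}
    for line in text.splitlines():
        m = FIELD_LINE.match(line)
        if m:
            fields[int(m.group(1))] = m.group(2)
    return fields

sample = """\
MsgType (35): D (ORDER SINGLE)
Symbol (55): A002
SendingTime (52): 20100723-12:49:52.296
CheckSum (10): 219 [correct]
"""
packet = parse_fields(sample)
```

Values stay as strings here, so a rule value taken from the command line would need to be compared as a string too.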

Currently I have the arguments coming off the command line into a nested list:

    [[35, 'D'], [55, 'A002']]

(where the first element of each sublist is the field number and second is the value)

I've tried iterating over this list of rules to accumulate a lambda expression:

    for field, value in args.send["fields_filter"]:
        if matchers["send"] == None:
            matchers["send"] = lambda fix : field in fix and fix[field] == value
        else:
            matchers["send"] = lambda fix : field in fix and fix[field] == value and matchers["send"](fix)

When I run the program though, I get the output:

    RuntimeError: maximum recursion depth exceeded in cmp

Are lambdas late-binding? If so, does that apply to every identifier in the expression, or only to those passed in as arguments? It seems to be the former.

What's the best way to achieve this functionality? I feel like I'm going about this the wrong way currently. Maybe this is a bad use of lambda expressions, but I don't know a better alternative for this.

+2  A: 

Don't use lambdas here. Names inside a lambda body are bound late: they are looked up when the lambda is called, not when it is defined. Perhaps you want partial from functools, but even that seems too complex.
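
To make the late-binding point concrete, here is a minimal sketch (Python 3 syntax; the variable names are made up for the example). It shows that every lambda built in a loop sees the loop variables' final values, that default arguments capture the values at definition time instead, and that the accumulation pattern from the question ends up calling itself, because `matchers["send"]` is also looked up only when the lambda runs. In Python 3 the error is `RecursionError`, a subclass of the `RuntimeError` shown in the question.

```python
rules = [[35, "D"], [55, "A002"]]
packet = {35: "D", 55: "XXXX"}

# Late binding: both lambdas look up field/value when called, so both test
# the last rule (55 == "A002").
late = [lambda fix: fix.get(field) == value for field, value in rules]
late_results = [m(packet) for m in late]

# Default arguments are evaluated when the lambda is defined, so each lambda
# keeps its own rule.
bound = [lambda fix, field=field, value=value: fix.get(field) == value
         for field, value in rules]
bound_results = [m(packet) for m in bound]

# The question's pattern: the second lambda refers to matchers["send"], which
# by call time is that same lambda, so it recurses without end.
matchers = {"send": None}
for field, value in rules:
    if matchers["send"] is None:
        matchers["send"] = lambda fix: fix.get(field) == value
    else:
        matchers["send"] = (lambda fix: fix.get(field) == value
                            and matchers["send"](fix))

try:
    matchers["send"]({35: "D", 55: "A002"})
    recursed = False
except RecursionError:
    recursed = True
```

So the answer to "all identifiers or just arguments" is that every free name in the body is resolved at call time; only default-argument values are fixed when the lambda is created.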

Your data coming in has field names, numbers and values, right?

Your command-line parameters use field numbers and values, right?

You want a dictionary keyed by field number. In that case, you don't need any complex lookups. You just want something like this.

    def match(packet_dict, criteria_list):
        # True only when every (field, value) rule matches; a missing field fails.
        return all(packet_dict.get(f) == v for f, v in criteria_list)

Something like that should handle everything for you.
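
If you would still rather build the matcher once and call it per packet, a closure factory avoids the late-binding trap, because the rules become local variables of the factory call. This is only a sketch: `make_matcher` is a hypothetical name, and it assumes every rule must match, as the chained `and` in the question does.

```python
def make_matcher(criteria):
    """Return a function that tests a packet dict against every rule."""
    # Copy the rules so later changes to the caller's list have no effect.
    rules = tuple((field, value) for field, value in criteria)

    def matcher(fix):
        # A missing field yields None from .get(), which fails the comparison.
        return all(fix.get(field) == value for field, value in rules)

    return matcher

send_matcher = make_matcher([[35, "D"], [55, "A002"]])
hit = send_matcher({35: "D", 55: "A002", 11: "BUY704552"})
miss = send_matcher({35: "8", 55: "A002"})
```

Each packet still costs one small loop over the rules, but there is no chain of nested function calls and no dependence on names that change after the matcher is built.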

S.Lott
Thanks for the suggestion of partial, I'm looking at it. I was trying to avoid iterating over the rules for every packet, because I'm running this against dumps of millions of packets and figured that generating a matching function once at the start and calling it each time would be much more efficient. I'll give it a go though and see how well it performs.
davedavedave
@davedavedave: A partial won't help much. A dictionary's hashed lookup is effectively constant time, and the criteria loop is short, so it adds very little overhead.
S.Lott