views:

103

answers:

3

Hi,

I have a task to match multiple events(facts) with each other by some their properties. As a result of events matching some action should be generated. Action can be generated when events of all exists types were matched.

Is there any algorithm which could be used for such task? Or any direction?

Thanks

Example: We have several events with different types and properties. Type SEEN is cumulative event (several events could be merged for matching) and type FOUND is not.

Event 1 (SEEN):
DATE="2009-09-30"
EYES_COLOR="BLUE"
LEFT_SOCK_COLOR="RED"

Event 2 (SEEN):
DATE="2009-09-30"
EYES_COLOR="BLUE"
RIGHT_SOCK_COLOR="GREEN"

Event 3 (FOUND):
DATE="2009-09-30"
EYES_COLOR="BLUE"
LEFT_SOCK_COLOR="BLUE"
RIGHT_SOCK_COLOR="GREEN"
PLACE="MARKET"

Event 4 (FOUND):
DATE="2009-09-30"
EYES_COLOR="BLUE"
LEFT_SOCK_COLOR="GREEN"
PLACE="SHOP"

Event 5 (FOUND):
DATE="2009-09-30"
EYES_COLOR="BLUE"
PLACE="AIRPORT"

For above events such actions should be generated (by composing matched events):

Action 1_2_3:
DATE="2009-09-30"
EYES_COLOR="BLUE"
LEFT_SOCK_COLOR="RED"
RIGHT_SOCK_COLOR="GREEN"
PLACE="MARKET"

Action 2_4:
DATE="2009-09-30"
EYES_COLOR="BLUE"
LEFT_SOCK_COLOR="GREEN"
PLACE="SHOP"

Means:

Event 1 + Event 2 + Event 3 => Action 1_2_3
Event 2 + Event 4 => Action 2_4
Event 5 does not match with anything.
+1  A: 

Hi,

in your case every two events are either compatible or not; we can denote this by C(e,e'), meaning that event e is compatible with event e'. You can build a maximal set of compatible events of course iteratively; when you have a set {e1,e2,...,en} of compatible events, you can add e' to the set if and only if e' is compatible with every e1,...,en, i.e. C(ei,e') is true for all 1<=i<=n.

Unfortunately in your case the number of maximal sets of compatible events can be exponential to the number of events, because you can have e.g. events e1, e2, e3 and e4 so that they are all pair-wisely compatible but none of them is compatible with TWO other events; for this set you will already get 6 different "actions", and they overlap each other.

A simple algorithm is to have a recursive search where you add events one by one to the prospectual "action", and when you can't add any more events you register the action; then you backtrack. It's called "backtracking search". You can improve its running time then by proper datastructures for "quickly" looking up the matching events.

As in the comment, the question about SEEN/FOUND is open; I'm assuming here that the fields are merged "as is".

antti.huima
Thanks, I think as a first step "backtracking search" is quite straightforward algorithm.
Andrey Vityuk
A: 

This pseudo-code may help: (C# syntax)

foreach (var found in events.Where(x => x.EventType == "Found"))
{
    var matches = events.Where(x => x.EventType == "Seen"
                               && x.Whatever == found.Whatever);
    if (matches.Count() > 0)
    {
        // Create an action based on the single "Found" event 
        // and the multiple matching "Seen" events.
    }
}
John Fisher
Yes, but there could any number of event types like "Found" and "Seen"
Andrey Vityuk
If each type has a different action, you will need something to code for each of them. So, multiple blocks like this would be one solution.
John Fisher
A: 

I'm not sure I understand the question correctly. It seems that for every FOUND event, you want to identify all matching SEEN events and merge them? Python code:

# assume events are dictionaries, and you have 2 lists of them by type:

# (omitting DATE because it's always "2009-09-03" in your example)
seen_events = [
    {
        "EYES_COLOR": "BLUE",
        "LEFT_SOCK_COLOR": "RED",
    },
    {
        "EYES_COLOR": "BLUE",
        "RIGHT_SOCK_COLOR": "GREEN",
    },
]

found_events = [    
    {
        "EYES_COLOR": "BLUE",
        "LEFT_SOCK_COLOR": "BLUE",
        "RIGHT_SOCK_COLOR": "GREEN",
        "PLACE": "MARKET",
    },
    {
        "EYES_COLOR": "BLUE",
        "LEFT_SOCK_COLOR": "GREEN",
        "PLACE": "SHOP",
    },
    {
        "EYES_COLOR": "BLUE",
        "PLACE": "AIRPORT",
    },
]

def do_action(seen_events, found):
    """DUMMY"""
    for seen in seen_events:
        print seen
    print found
    print

# brute force
for found in found_events:
    matching = []
    for seen in seen_events:
        for k in found:
            if k in seen and seen[k] != found[k]:
                break
        else:  # for ended without break (Python syntax)
            matching.append(seen)
    if matching:
        do_action(matching, found)

which prints:

{'EYES_COLOR': 'BLUE', 'RIGHT_SOCK_COLOR': 'GREEN'}
{'EYES_COLOR': 'BLUE', 'PLACE': 'MARKET', 'LEFT_SOCK_COLOR': 'BLUE', 'RIGHT_SOCK_COLOR': 'GREEN'}

{'EYES_COLOR': 'BLUE', 'RIGHT_SOCK_COLOR': 'GREEN'}
{'EYES_COLOR': 'BLUE', 'PLACE': 'SHOP', 'LEFT_SOCK_COLOR': 'GREEN'}

{'EYES_COLOR': 'BLUE', 'LEFT_SOCK_COLOR': 'RED'}
{'EYES_COLOR': 'BLUE', 'RIGHT_SOCK_COLOR': 'GREEN'}
{'EYES_COLOR': 'BLUE', 'PLACE': 'AIRPORT'}

Right, this is not effecient - O(n*m) - but does this even describe the problem correctly?

Beni Cherniavsky-Paskin