I want to get the union of 2 nested lists plus an index to the common values.

I have two lists like A = [[1,2,3],[4,5,6],[7,8,9]] and B = [[1,2,3,4],[3,3,5,7]], but the real lists each contain about 100,000 sublists. A comes with an index vector I of the same length as A, here I = [2,3,4].

What I want is to find all sublists in B where the first 3 elements are equal to a sublist in A. In this example I want to get B[0] returned ([1,2,3,4]) because its first three elements are equal to A[0]. In addition, I also want the index to A[0] in this example, that is I[0].

I tried different things, but nothing worked so far :(

First I tried this:

Common = []
for i in range(len(B)):
    if B[i][:3] in A:
        # value from I for the first sublist in A equal to B[i][:3]
        idx = [I[x] for x, y in enumerate(A) if y == B[i][:3]][0]
        Common.append([int(idx)] + B[i])

But that takes ages or never finishes.

Then I transformed A and B into sets and took the union of both, which was very quick, but then I don't know how to get the corresponding indices.

Does anyone have an idea?

+1  A: 

Create an auxiliary dict (work is O(len(A))), assuming the first three items of a sublist in A uniquely identify it (otherwise you need a dict of lists):

aud = dict((tuple(a[:3]), i) for i, a in enumerate(A))
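
If the first three items do not uniquely identify a sublist in A, the "dict of lists" variant mentioned above could look roughly like the following sketch. The sample data and the name aud_multi are assumptions made for illustration only; they do not come from the question.

```python
from collections import defaultdict

# Assumed sample data: the first and last sublists of A share the same key.
A = [[1, 2, 3], [4, 5, 6], [1, 2, 3]]

# Map each 3-item prefix to every position in A where it occurs.
aud_multi = defaultdict(list)
for i, a in enumerate(A):
    aud_multi[tuple(a[:3])].append(i)
```

A lookup such as aud_multi.get(tuple(b[:3]), []) then yields all matching positions in A for a sublist b of B, or an empty list when there is no match.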

Use said dict to loop once on B (work is O(len(B))) to get B sublists and A indices:

result = [(b, aud[tuple(b[:3])]) for b in B if tuple(b[:3]) in aud]
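
Put together on the example data from the question, the two steps might look like this minimal sketch. It assumes the prefixes in A are unique and that the wanted output is the entry of I followed by the matching sublist of B, as in the question's own attempt; the names common and pos are illustrative.

```python
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[1, 2, 3, 4], [3, 3, 5, 7]]
I = [2, 3, 4]  # index vector belonging to A

# Step 1: prefix -> position in A (assumes prefixes in A are unique).
aud = {tuple(a[:3]): i for i, a in enumerate(A)}

# Step 2: one pass over B, keeping each match together with its entry of I.
common = []
for b in B:
    pos = aud.get(tuple(b[:3]))
    if pos is not None:  # position 0 is a valid match, so compare with None
        common.append([I[pos]] + b)

print(common)  # [[2, 1, 2, 3, 4]]
```

Each dict lookup takes constant time on average, so the total work is roughly O(len(A) + len(B)) instead of the O(len(A) * len(B)) scan performed by the nested membership test.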
Alex Martelli
Well done for working out enough of what's being asked to produce an answer. My brain hurts trying to understand what it is he wants.
MattH
That worked and superquick! Thank you so much!
sbas