views:

170

answers:

6

I have this code here, it is supposed to remove the common letters from both the lists n1 and n2. But when i run this code it only runs once as in it removes only 'a' from both n1 and n2 and doesnt remove 'k'.

Just to clarify this code should always work on only 2 words.

name1 = "abdjek"
name2 = "doarhsnk"

n1l = list(name1)
n2l = list(name2)

for i in range(len(n1l)):
   for j in range(len(n2l)):
         if n1l[i] == n2l[j]:
               n1l.pop(i)
               n2l.pop(j)
               n1l.append('0')
               n2l.append('1')

Ok wait, it seems to work for the above 2 names but when i have name1 = "naveen" and name2 = "darshana" it doesnt work!

A: 

It's working here for me. That is, if I add print statements thus:

name1 = "abdjek"
name2 = "doarhsnk"

n1l = list(name1)
n2l = list(name2)

print "Lists before loop:"
print n1l
print n2l

for i in range(len(n1l)):
    for j in range(len(n2l)):
        if n1l[i] == n2l[j]:
           n1l.pop(i)
           n2l.pop(j)
           n1l.append('0')
           n2l.append('1')

print "Lists after loop:"
print n1l
print n2l

the characters 'a', 'd', and 'k' are all removed:

> python test.py 
Lists before loop:
['a', 'b', 'd', 'j', 'e', 'k']
['d', 'o', 'a', 'r', 'h', 's', 'n', 'k']
Lists after loop:
['b', 'j', 'e', '0', '0', '0']
['o', 'r', 'h', 's', 'n', '1', '1', '1']
Andrew Walker
ok i think it works... but when i havename1 = "naveen"name2 = "darshana"it doesnt work!!
Prabhu
+5  A: 

I suggest a much simpler approach:

def removecommon(name1, name2):
  common = set(name1).intersection(name2)
  res1 = ''.join(n for n in name1 if n not in common)
  res2 = ''.join(n for n in name2 if n not in common)
  return res1, res2

n1, n2 = removecommon('naveen', 'darshana')
print n1, n2

emits vee drsh as desired.

Edit: as the OP now specified (in a comment -- pls remember to edit your question too, oh OP!) that he actually wants to remove only the first occurrence in each word of each common letter, the needed algorithm is of course completely different. A simple approach (feasible if the length of the words is not too high):

def removefirstcommon(name1, name2):
  common = set(name1).intersection(name2)
  n1 = list(name1)
  for c in common: n1.remove(c)
  n2 = list(name2)
  for c in common: n2.remove(c)
  return ''.join(n1), ''.join(n2)

A more elaborate approach (while slower for normal-length words) would be faster for extremely long words (since the following is O(N) while the former's O(N squared)):

def removefirstcommonlongwords(name1, name2):
  common = set(name1).intersection(name2)
  def mustrem(c, copycom):
    res = c not in copycom
    copycom.discard(c)
    return res
  cop = set(common)
  n1 = [c for c in name1 if mustrem(c, cop)]
  n2 = [c for c in name2 if mustrem(c, common)]
  return ''.join(n1), ''.join(n2)
Alex Martelli
@Alex Martelli thanks for this new approach but i have 1 more problem, i have to remove only the first instance of the match, i must get an out put of 'veen' and 'darsha'. I dont want to remove all the instances of 'n' and 'a'
Prabhu
@Prabhu, how do you get 'darsha' by removing the first occurrences of a and n from 'darshana'? 'drshaa' would seem to follow the "remove the first occurrence" rule. Clarify please?
Alex Martelli
@Alex Martelli yes! sorry typo, you are right i should get drshaa
Prabhu
+1  A: 

A more pythonic approach would be to use sets and list comprehensions.

name1 = "naveen"; name2 = "darshana"

name1_set=set(name1)
name2_set=set(name2)

clean1=[x for x in  name1 if x not in name2_set]
clean2=[x for x in name2 if x not in name1_set]

clean1.extend(['0']*(len(name1)-len(clean1)))
clean2.extend(['1']*(len(name2)-len(clean2)))

print clean1,clean2

set gives us O(1) lookups, thus making the whole process faster by making it O(N) instead of O(N^2).

EDIT: In light of your later comment that the number of occurrences matter, this is the updated version that takes that into account.

name1 = "naveen"; name2 = "darshana"

count1={}
count2={}


for x in name1:
    count1[x]=count1.get(x,0)+1

for x in name2:
    count2[x]=count2.get(x,0)+1

def remove_dups(name,count,null):
    clean=[]
    for x in name:
        if count.get(x,0):
            count[x]-=1
        else:
            clean.append(x)
    clean.extend([null]*(len(name)-len(clean)))
    return clean

clean1=remove_dups(name1,count2,'0')
clean2=remove_dups(name2,count1,'1')

print clean1,clean2

It uses dicts to keep counts of occurrences. Whenever a character is removed, the corresponding count for the other name is decremented. Complexity is still O(N).

It prints ['v', 'e', 'e', 'n', '0', '0'] and ['d', 'r', 's', 'h', 'a', 'a', '1', '1']. Is that what you wanted?

MAK
@MAK yes, thank you!
Prabhu
`set` takes an iterable, so `set('foo')` works, no need to convert to an intermediate list.
Mike Graham
@Mike: Right. Forgot that string is already iterable. Answer updated, thanks.
MAK
A: 

Your code might not be working as you expect it to, as it is removing paired letters. For instance, you see an a, you then remove both a's from your words...

Bear
A: 

Your code most probably fails because you pop common letters from anywhere in the list, but append the replacements ("0" and "1") to the end of the list. They should be at position i and j, respectively.

So the loop should probably look like this:

for i in range(len(n1l)):
   for j in range(len(n2l)):
         if n1l[i] == n2l[j] and n1l[i] not in ("0", "1"):
               print "common letter ", n1l[i]
               # Replace i-th and j-th element
               n1l[i] = "0"
               n2l[j] = "1"

Anyway, there more "pythonic" ways, which are already shown in the other answers.

EDIT: Tested and working also with name1 = "naveen" / name2 = "darshana".

AndiDog
A: 

Here's some quite (IMHO quite elegant) code that runs in O(n). If word 1 has N occurrences of the letter x, it removes the first N x's from word 2 (and vice versa) -- I think this is what you want, but I could be wrong.

from collections import defaultdict

def build(s, chars_s, chars_t):
    """Return characters of s, with duplicate characters from t removed."""
    for i, char in enumerate(s):
        indexes_s, indexes_t = chars_s[char], chars_t[char]
        if len(indexes_s) > len(indexes_t) and i >= indexes_s[len(indexes_t)]:
            yield char

def rm_dup(a, b):
    """Pairwise remove duplicate letters in a and b."""
    chars_a, chars_b = defaultdict(list), defaultdict(list)
    for i, char in enumerate(a): chars_a[char].append(i)
    for i, char in enumerate(b): chars_b[char].append(i)
    return (''.join(build(a, chars_a, chars_b)),
            ''.join(build(b, chars_b, chars_a)))

print rm_dup('naveen', 'darshana')
Paul Hankin