ansaurus

Question

Count the number of occurrences of letters in a Python string

Answer 1

+7 A:

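# thesequence is assumed to be the input string, defined earlier
for base in 'ACGT':
    # add the uppercase and lowercase counts; no lowered copy of the string is made
    print(base, thesequence.count(base) + thesequence.count(base.lower()))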

Alex Martelli 2009-11-15 19:00:05

Thank you sire wholeheartedly.

Joshua 2009-11-15 19:02:08

Out of curiosity, is there a reason you don't do thesequence.lower().count(base.lower()), instead? I'm guessing it's to make it faster, but I'm not 100% sure.

Edan Maor 2009-11-15 19:03:47

It's not necessarily faster this way, but it takes less memory. Since DNA sequences can be **long**, this can be important.

sth 2009-11-15 19:26:08

Yep, as you need to do two passes anyway, it's better to have both be counting ones (memory-thrifty) rather than have one take up O(N) extra temporary memory. If you do have memory to spare, a single `tmp = thesequence.lower()` outside the loop (then loop over `'acgt'` in lowercase doing just `tmp.count(base)`) is going to be faster. A single pass with a finditer on a case-insensitive RE might be fastest, but **a lot** less simple than these approaches ;-).

Alex Martelli 2009-11-15 19:52:17
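
For reference, the two alternatives mentioned in the last comment could look roughly like this. It is only a sketch in Python 3: the function names and the sample string are made up for illustration, and no timings are claimed here, so measure on your own data before picking one.

```python
import re


def count_bases_lowered(seq, bases="acgt"):
    """Lower the string once (O(N) extra memory), then count each base."""
    tmp = seq.lower()
    return {base: tmp.count(base) for base in bases}


def count_bases_finditer(seq, bases="acgt"):
    """Single pass with a case-insensitive regex; no lowered copy is made."""
    counts = {base: 0 for base in bases}
    pattern = re.compile("[" + re.escape(bases) + "]", re.IGNORECASE)
    for match in pattern.finditer(seq):
        # normalize the matched character so 'A' and 'a' share one counter
        counts[match.group().lower()] += 1
    return counts


sample = "ACGTacgtNNaa"  # made-up sequence, not from the question
print(count_bases_lowered(sample))
print(count_bases_finditer(sample))
```

Both functions return a dict keyed by the lowercase base, so they can be swapped for each other. The first one trades memory for simplicity, the second avoids the temporary copy at the cost of a Python-level loop over the matches.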
