Obtaining overrepresented motifs in DNA sequences, part 8
Phase 2, motifs June 3rd, 2008
We keep on introducing Mike’s functions. This time there are a couple of Python functions that we haven’t seen here and that need some introduction: izip and count. To use them we need to import them from the itertools module:
from itertools import count, izip
count returns consecutive integers starting at a defined point (the function’s parameter). If no argument is given, it starts from zero. Starting a count gives an iterator of increasing integer values, much like a generator function written with yield: every time our loop asks the count for a value, it “remembers” the last one it returned and increments it by one. The iterator never ends by itself, so something else has to stop the loop.
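A quick way to see this behaviour is to pull a few values out of count by hand. This is only a small sketch; the variable names are made up for illustration, and islice is used just to cut the endless iterator short.

```python
from itertools import count, islice

# count() with no argument starts at zero
counter = count()
first_three = [next(counter), next(counter), next(counter)]
print(first_three)  # [0, 1, 2]

# the iterator remembers where it stopped
fourth = next(counter)
print(fourth)  # 3

# count(10) starts at the given value; islice takes only the first four items
from_ten = list(islice(count(10), 4))
print(from_ten)  # [10, 11, 12, 13]
```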
izip also returns an iterator, but one built from several iterables. It is used to step through many iterables at the same time, yielding one tuple of items per step and stopping as soon as the shortest iterable runs out. In the function below it is used twice: once, together with count, to generate tuples holding a sequence number and the sequence itself, and once more on that sequence, together with count and range, to generate the start and stop indices of the windows in which motifs are counted.
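To make the two uses of izip concrete, here is a small sketch on made-up sequences. Two assumptions are worth flagging: izip exists only in Python 2, so the snippet falls back to the built-in zip (which is already lazy) on Python 3, and the stop indices here run up to and including len(seq) so that the window ending at the last base is kept.

```python
from itertools import count

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3: the built-in zip already returns an iterator

seqs = ["ACGT", "TTGA"]  # toy sequences, made up for this example

# first use: pair every sequence with a running number
numbered = list(izip(count(), seqs))
print(numbered)  # [(0, 'ACGT'), (1, 'TTGA')]

# second use: pair start indices (0, 1, 2, ...) with stop indices
# (mlen, mlen + 1, ...) to slide a window of width mlen along a sequence
mlen = 2
seq = seqs[0]
windows = [seq[s:e] for s, e in izip(count(), range(mlen, len(seq) + 1))]
print(windows)  # ['AC', 'CG', 'GT']
```

Because izip stops at the shortest iterable, the endless count() is cut off as soon as seqs or the range is exhausted.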
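from collections import defaultdict

def get_quorums_06(seqs, mlen):
    """
    add seq id_no to a set
    use 'izip(count(), ...)' to create seq_no
    use 'izip(count(), range(...))' to create start/stop indices for motifs
    """
    # maps each motif to the set of sequence numbers in which it occurs
    quorum = defaultdict(set)
    for id_no, seq in izip(count(), seqs):
        # the stop index must reach len(seq), otherwise the last window is skipped
        for s, e in izip(count(), range(mlen, len(seq) + 1)):
            quorum[seq[s:e]].add(id_no)
    return quorum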
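As a rough usage sketch, the same idea can be run on a few toy sequences to see which motifs are shared. The sequences, the name get_quorums_sketch and the Python 3 fallback for izip are assumptions added here, not part of Mike’s code; the range also goes up to len(seq) + 1 so that the window ending at the last base is included.

```python
from collections import defaultdict
from itertools import count

try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # Python 3 fallback

def get_quorums_sketch(seqs, mlen):
    """Hypothetical stand-in for get_quorums_06 that runs on Python 2 and 3."""
    quorum = defaultdict(set)
    for id_no, seq in izip(count(), seqs):
        for s, e in izip(count(), range(mlen, len(seq) + 1)):
            quorum[seq[s:e]].add(id_no)
    return quorum

seqs = ["ACGTA", "CGTAC", "TTACG"]  # toy data, made up for this example
quorum = get_quorums_sketch(seqs, 3)

# list every motif with the numbers of the sequences that contain it
for motif in sorted(quorum):
    print("%s %s" % (motif, sorted(quorum[motif])))

# motifs found in at least two of the three sequences
shared = sorted(m for m, ids in quorum.items() if len(ids) >= 2)
print(shared)  # ['ACG', 'CGT', 'GTA', 'TAC']
```

Since each motif maps to a set of sequence numbers, a motif that appears several times in the same sequence still counts only once for that sequence.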
In the next couple of posts we will still be checking motif quorum functions. Stay tuned.