<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Translating DNA into proteins: second approach, now using FASTA files</title>
	<atom:link href="http://python.genedrift.org/2007/07/11/translating-dna-into-proteins-second-approach-now-using-fasta-files/feed/" rel="self" type="application/rss+xml" />
	<link>http://python.genedrift.org/2007/07/11/translating-dna-into-proteins-second-approach-now-using-fasta-files/</link>
	<description>a step-by-step guide to create Python applications in bioinformatics</description>
	<lastBuildDate>Sun, 02 May 2010 04:24:01 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1-alpha</generator>
	<item>
		<title>By: Alexandre</title>
		<link>http://python.genedrift.org/2007/07/11/translating-dna-into-proteins-second-approach-now-using-fasta-files/comment-page-1/#comment-14362</link>
		<dc:creator>Alexandre</dc:creator>
		<pubDate>Wed, 11 Jun 2008 01:45:30 +0000</pubDate>
		<guid isPermaLink="false">http://python.genedrift.org/2007/07/11/translating-dna-into-proteins-second-approach-now-using-fasta-files/#comment-14362</guid>
		<description>Here are some alternative implementations of your translate function.

&lt;code&gt;import re

def translate_dna_0(sequence):
    proteinseq = &#039;&#039;
    for n in range(0,len(sequence),3):
        # Comparing with &quot;== True&quot; is ugly. Use the keyword &#039;is&#039; (as in
        # &quot;obj is True&quot;) when testing singletons such as True, False and
        # None, or better still, omit the comparison completely.
        if gencode.has_key(sequence[n:n+3]) == True:
            # Repeated string concatenation can be quadratic in Python. Since
            # strings are immutable, a new string object may need to be created
            # every time. However, the main Python implementation (CPython) does
            # concatenation quite efficiently, so the performance benefits of
            # avoiding it are not extraordinary.
            proteinseq += gencode[sequence[n:n+3]]
    return proteinseq

def translate_dna_1(sequence):
    # Use a list to avoid the quadratic behavior of string
    # concatenation.
    proteinseq = []
    for n in range(0, len(sequence), 3):
        # Use the idiomatic &#039;in&#039; keyword to test whether a sequence
        # is in the gencode dictionary.
        if sequence[n:n+3] in gencode:
            proteinseq.append(gencode[sequence[n:n+3]])
    return &#039;&#039;.join(proteinseq)

def translate_dna_2(sequence):
    # Alternative implementation: use re.findall to split the sequence
    # into codons (groups of three nucleotides).
    proteinseq = []
    for codon in re.findall(&quot;...&quot;, sequence):
        if codon in gencode:
            proteinseq.append(gencode[codon])
    return &#039;&#039;.join(proteinseq)

def translate_dna_3(sequence):
    # Alternative implementation: use re.sub with a callable to do the whole
    # translation. Short, but slow due to the function call overhead.
    def replace(match):
        # Unknown or incomplete codons map to &#039;&#039;, matching the variants above.
        return gencode.get(match.group(0), &#039;&#039;)
    return re.sub(&quot;.{1,3}&quot;, replace, sequence)
&lt;/code&gt;</description>
		<content:encoded><![CDATA[<p>Here are some alternative implementations of your translate function.</p>
<p><code>import re</p>
<p>def translate_dna_0(sequence):<br />
    proteinseq = ''<br />
    for n in range(0,len(sequence),3):<br />
        # Comparing with "== True" is ugly. Use the keyword 'is' (as in<br />
        # "obj is True") when testing singletons such as True, False and<br />
        # None, or better still, omit the comparison completely.<br />
        if gencode.has_key(sequence[n:n+3]) == True:<br />
            # Repeated string concatenation can be quadratic in Python. Since<br />
            # strings are immutable, a new string object may need to be created<br />
            # every time. However, the main Python implementation (CPython) does<br />
            # concatenation quite efficiently, so the performance benefits of<br />
            # avoiding it are not extraordinary.<br />
            proteinseq += gencode[sequence[n:n+3]]<br />
    return proteinseq</p>
<p>def translate_dna_1(sequence):<br />
    # Use a list to avoid the quadratic behavior of string<br />
    # concatenation.<br />
    proteinseq = []<br />
    for n in range(0, len(sequence), 3):<br />
        # Use the idiomatic 'in' keyword to test whether a sequence<br />
        # is in the gencode dictionary.<br />
        if sequence[n:n+3] in gencode:<br />
            proteinseq.append(gencode[sequence[n:n+3]])<br />
    return ''.join(proteinseq)</p>
<p>def translate_dna_2(sequence):<br />
    # Alternative implementation: use re.findall to split the sequence<br />
    # into codons (groups of three nucleotides).<br />
    proteinseq = []<br />
    for codon in re.findall("...", sequence):<br />
        if codon in gencode:<br />
            proteinseq.append(gencode[codon])<br />
    return ''.join(proteinseq)</p>
<p>def translate_dna_3(sequence):<br />
    # Alternative implementation: use re.sub with a callable to do the whole<br />
    # translation. Short, but slow due to the function call overhead.<br />
    def replace(match):<br />
        # Unknown or incomplete codons map to '', matching the variants above.<br />
        return gencode.get(match.group(0), '')<br />
    return re.sub(".{1,3}", replace, sequence)<br />
</code></p>
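<p>To see how the list-based and regex-based variants treat unknown or incomplete codons, here is a minimal self-contained sketch. The three-entry gencode table and the names translate_join and translate_sub are assumptions made for illustration only; the original post defines the full table, and its stop-codon symbol may differ from the '*' used here.</p>

```python
import re

# Hypothetical three-codon subset of the standard genetic code, for
# illustration only; the original post defines the full gencode table.
gencode = {"ATG": "M", "GCC": "A", "TAA": "*"}


def translate_join(sequence):
    # List-and-join style: codons missing from the table (including a
    # trailing partial codon) are skipped.
    codons = (sequence[n:n + 3] for n in range(0, len(sequence), 3))
    return "".join(gencode[c] for c in codons if c in gencode)


def translate_sub(sequence):
    # re.sub style: ".{1,3}" also matches a trailing partial codon, and
    # dict.get maps anything missing from the table to the empty string.
    return re.sub(".{1,3}", lambda m: gencode.get(m.group(0), ""), sequence)


# 'NNN' is unknown and the final 'GC' is incomplete; both are dropped.
print(translate_join("ATGGCCNNNTAAGC"))
print(translate_sub("ATGGCCNNNTAAGC"))
```

<p>Both functions should agree on any input that contains no newlines, so one can serve as a cross-check for the other.</p>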
]]></content:encoded>
	</item>
</channel>
</rss>

