<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Obtaining overrepresented motifs in DNA sequences, part 12.5</title>
	<atom:link href="http://python.genedrift.org/2008/08/15/obtaining-overrepresented-motifs-in-dna-sequences-part-12/feed/" rel="self" type="application/rss+xml" />
	<link>http://python.genedrift.org/2008/08/15/obtaining-overrepresented-motifs-in-dna-sequences-part-12/</link>
	<description>a step-by-step guide to create Python applications in bioinformatics</description>
	<lastBuildDate>Mon, 22 Feb 2010 18:22:18 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=3.0-alpha</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Paulo Nuin</title>
		<link>http://python.genedrift.org/2008/08/15/obtaining-overrepresented-motifs-in-dna-sequences-part-12/comment-page-1/#comment-17648</link>
		<dc:creator>Paulo Nuin</dc:creator>
		<pubDate>Fri, 15 Aug 2008 13:12:52 +0000</pubDate>
		<guid isPermaLink="false">http://python.genedrift.org/?p=137#comment-17648</guid>
		<description>Thanks Andrew, I will change the code and post the results.</description>
		<content:encoded><![CDATA[<p>Thanks Andrew, I will change the code and post the results.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andrew Dalke</title>
		<link>http://python.genedrift.org/2008/08/15/obtaining-overrepresented-motifs-in-dna-sequences-part-12/comment-page-1/#comment-17631</link>
		<dc:creator>Andrew Dalke</dc:creator>
		<pubDate>Fri, 15 Aug 2008 06:24:04 +0000</pubDate>
		<guid isPermaLink="false">http://python.genedrift.org/?p=137#comment-17631</guid>
		<description>I sent a couple of links to binomial function implementations without reviewing the code myself.  You used the second of those links, which happens to be amazingly slow.  It&#039;s actually computing all of the values of the distribution for a given size n, then returning the one requested.

Choosing one of the implementations from the first link (the &quot;non-clever&quot; one at http://groups.google.com/group/comp.lang.python/msg/6e7c3358b086ff9c?dmode=source):

def choose(n, k):
   if 0 &lt;= k &lt;= n:
       ntok = 1
       ktok = 1
       for t in xrange(1, min(k, n - k) + 1):
          ntok *= n
          ktok *= t
          n -= 1
       return ntok // ktok
   else:
       return 0

I get as output

fac: 0.01
choose: 0.00

Since it doesn&#039;t actually take 0.0 time I changed the repeat count from 10 to 10,000 and removed the &quot;/0&quot;.  This will give you an idea of the performance difference:

fac: 140.19
choose: 0.05

or roughly 3,000 times faster.

(BTW,  I had a hard time writing &quot;binomial&quot; - my fingers keep wanting to type &quot;bionomial&quot; :)</description>
		<content:encoded><![CDATA[<p>I sent a couple of links to binomial function implementations without reviewing the code myself.  You used the second of those links, which happens to be amazingly slow.  It&#8217;s actually computing all of the values of the distribution for a given size n, then returning the one requested.</p>
<p>Choosing one of the implementations from the first link (the &#8220;non-clever&#8221; one at <a href="http://groups.google.com/group/comp.lang.python/msg/6e7c3358b086ff9c?dmode=source" rel="nofollow">http://groups.google.com/group/comp.lang.python/msg/6e7c3358b086ff9c?dmode=source</a>):</p>
<p>def choose(n, k):<br />
   if 0 &lt;= k &lt;= n:<br />
       ntok = 1<br />
       ktok = 1<br />
       for t in xrange(1, min(k, n - k) + 1):<br />
          ntok *= n<br />
          ktok *= t<br />
          n -= 1<br />
       return ntok // ktok<br />
   else:<br />
       return 0</p>
<p>I get as output</p>
<p>fac: 0.01<br />
choose: 0.00</p>
<p>Since it doesn&#8217;t actually take 0.0 time I changed the repeat count from 10 to 10,000 and removed the &#8220;/0&#8221;.  This will give you an idea of the performance difference:</p>
<p>fac: 140.19<br />
choose: 0.05</p>
<p>or roughly 3,000 times faster.</p>
<p>(BTW, I had a hard time writing &#8220;binomial&#8221; &#8211; my fingers keep wanting to type &#8220;bionomial&#8221; :)</p>
]]></content:encoded>
	</item>
</channel>
</rss>

