<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Beginning Python for Bioinformatics &#187; Phase 2</title>
	<atom:link href="http://python.genedrift.org/category/phase-2/feed/" rel="self" type="application/rss+xml" />
	<link>http://python.genedrift.org</link>
	<description>a step-by-step guide to create Python applications in bioinformatics</description>
	<lastBuildDate>Wed, 10 Mar 2010 13:03:32 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=3.0-alpha</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Preview of Python Testing Beginner&#8217;s Guide</title>
		<link>http://python.genedrift.org/2010/02/22/preview-of-python-testing-beginners-guide/</link>
		<comments>http://python.genedrift.org/2010/02/22/preview-of-python-testing-beginners-guide/#comments</comments>
		<pubDate>Mon, 22 Feb 2010 16:00:53 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[book]]></category>
		<category><![CDATA[preview]]></category>
		<category><![CDATA[python testing]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=347</guid>
		<description><![CDATA[I was invited by Packt Publishing to review Python Testing Beginner&#8217;s Guide by Daniel Arbuckle. This is a book on one of the most important aspects of scientific programming (even though the majority of scientific software doesn&#8217;t have any testing routines): code testing, checking whether your code actually does what it is intended to do. I [...]]]></description>
			<content:encoded><![CDATA[<p>I was invited by Packt Publishing to review <strong><a href="http://www.packtpub.com/python-testing-beginners-guide/book/mid/240210aaphjg">Python Testing Beginner&#8217;s Guide</a></strong> by Daniel Arbuckle. This is a book on one of the most important aspects of scientific programming (even though the majority of scientific software doesn&#8217;t have any testing routines): code testing, checking whether your code actually does what it is intended to do. I can say I&#8217;m not really an expert on testing, so I guess I&#8217;m the right audience for it:</p>
<blockquote><p>You&#8217;ll learn about several of Python&#8217;s automated testing tools, and you&#8217;ll learn about the philosophies and methodologies that they were designed to support, like unit testing and test-driven development. When you&#8217;re done, you&#8217;ll be able to produce thoroughly tested code faster and more easily than ever before, and you&#8217;ll be able to do it in a way that doesn&#8217;t distract you from your &#8220;real&#8221; programming.</p></blockquote>
<p>Packt also supplied a preview/sample chapter that you can download <a href='http://python.genedrift.org/wordpress/wp-content/uploads/2010/02/8846-python-testing-beginners-guide-sample-chapter-5-when-doctest-isnt-enough-unittest-to-the-rescue.pdf'>here</a>.</p>
<p>I hope to have a review ready by the end of the week, before the Ontario Institute for Cancer Research retreat; otherwise I will try to post a full review next week.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2010/02/22/preview-of-python-testing-beginners-guide/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Preliminary review of Python for Bioinformatics by Sebastian Bassi</title>
		<link>http://python.genedrift.org/2010/01/03/preliminary-review-of-python-for-bioinformatics-by-sebastian-bassi/</link>
		<comments>http://python.genedrift.org/2010/01/03/preliminary-review-of-python-for-bioinformatics-by-sebastian-bassi/#comments</comments>
		<pubDate>Sun, 03 Jan 2010 19:25:00 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[biopython]]></category>
		<category><![CDATA[book]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[review]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=339</guid>
		<description><![CDATA[Let me start by saying that Python for Bioinformatics (Chapman &#38; Hall/Crc Mathematical &#38; Computational Biology) is a massive book, massive in the sense that it contains a lot of material. I still haven&#8217;t had enough time to check everything, but I&#8217;m well into the first section of the book, which gives an initial view [...]]]></description>
			<content:encoded><![CDATA[<p>Let me start by saying that <a href="http://www.amazon.com/gp/product/1584889292?ie=UTF8&amp;tag=genedrift-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=1584889292">Python for Bioinformatics (Chapman &amp; Hall/Crc Mathematical &amp; Computational Biology)</a> is a massive book, massive in the sense that it contains a lot of material. I still haven&#8217;t had enough time to check everything, but I&#8217;m well into the first section of the book, which gives an initial view of Python and how to program in it.</p>
<p>The initial section of the book is well written (I&#8217;m not going to criticize the book in terms of good/poor English, as I&#8217;m not well qualified to do that), and gives a clear perspective on how to program in Python for scientists, who are the main target audience of the book. Of course, it always helps to have some basic knowledge of command line shells, but the book also includes some explanations of IDLE and other Python-capable IDEs. I cannot say that I read this section with enough care and attention, but what I can say is that you won&#8217;t miss a beat with PfB, as it has more material than I expected. I still have to start on the more advanced topics, like <a class="zem_slink freebase/en/biopython" href="http://biopython.org/" title="Biopython" rel="homepage">BioPython</a> and so forth, which I plan to do in the coming month, and as I don&#8217;t have a lot of experience with BioPython, I&#8217;m looking forward to it.</p>
<p>On the other hand, I have a small complaint that is perhaps more about style than substance. I don&#8217;t like the design of the book: the way the code is interleaved with the text and the way the code explanations are presented. Most of the code blocks are followed by a careful explanation, but this explanation is laid out as a figure caption for the code block. That is quite annoying, because it interrupts the flow of the text too often and it is easy to lose focus (in my case, not necessarily everyone&#8217;s).</p>
<p>Another minor detail is the use of &#8220;he&#8221; every time scientists are referred to (one example is on page 3, in the second sentence of the introduction). The (politically) correct choice would be to use &#8220;he or she&#8221; or &#8220;she or he&#8221; (but that&#8217;s OK with me).</p>
<p>I will try to post more complete reviews of the sections on topics I have not mastered. I would also like to thank Sebastian for sending me a copy of the book.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2010/01/03/preliminary-review-of-python-for-bioinformatics-by-sebastian-bassi/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Wiki</title>
		<link>http://python.genedrift.org/2009/05/11/wiki/</link>
		<comments>http://python.genedrift.org/2009/05/11/wiki/#comments</comments>
		<pubDate>Mon, 11 May 2009 22:43:31 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[wiki]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=294</guid>
		<description><![CDATA[I&#8217;m slowly moving the posts from the blog to a wiki. It makes it easier to display post series and allows people to modify, enhance and discuss them.
The wiki address is http://wiki.genedrift.org.
]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;">
<div>
<dl style="width: 310px;" class="wp-caption alignright">
<dt class="wp-caption-dt"><a href="http://commons.wikipedia.org/wiki/Image:History_comparison_example.png"><img src="http://upload.wikimedia.org/wikipedia/commons/thumb/3/3d/History_comparison_example.png/300px-History_comparison_example.png" alt="History comparison reports highlight the chang..." title="History comparison reports highlight the chang..." width="300" height="263"></a></dt>
<dd class="wp-caption-dd zemanta-img-attribution" style="font-size: 0.8em;">Image via <a href="http://commons.wikipedia.org/wiki/Image:History_comparison_example.png">Wikipedia</a></dd>
</dl>
</div>
</div>
<p>I&#8217;m slowly moving the posts from the blog to a wiki. It makes it easier to display post series and allows people to modify, enhance and discuss them.</p>
<p>The wiki address is <a href="http://wiki.genedrift.org">http://wiki.genedrift.org</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/05/11/wiki/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Managing a simple database with Python, SQLite and wxPython, 8</title>
		<link>http://python.genedrift.org/2009/04/22/managing-a-simple-database-with-python-sqlite-and-wxpython-8/</link>
		<comments>http://python.genedrift.org/2009/04/22/managing-a-simple-database-with-python-sqlite-and-wxpython-8/#comments</comments>
		<pubDate>Wed, 22 Apr 2009 15:04:17 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[wxPython]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[SQLite]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=282</guid>
		<description><![CDATA[Thanks to the comments and suggestions on the last post, it is now possible to make a more Pythonic and truly generic database update class. Let&#8217;s check how the &#8220;generic&#8221; update/edit entry function currently looks:

def update_data(self, values_list):
    &#039;&#039;&#039;edits and updates fields&#039;&#039;&#039;

    if sys.platform == &#039;darwin&#039;:
    [...]]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;">
<div>
<dl style="width: 210px;" class="wp-caption alignright">
<dt class="wp-caption-dt"><a href="http://commons.wikipedia.org/wiki/Image:Gene.png"><img src="http://upload.wikimedia.org/wikipedia/commons/thumb/0/07/Gene.png/200px-Gene.png" alt="Diagram of the location of introns and exons w..." title="Diagram of the location of introns and exons w..." height="160" width="200"></a></dt>
<dd class="wp-caption-dd zemanta-img-attribution" style="font-size: 0.8em;">Image via <a href="http://commons.wikipedia.org/wiki/Image:Gene.png">Wikipedia</a></dd>
</dl>
</div>
</div>
<p>Thanks to the comments and suggestions on the last post, it is now possible to make a more Pythonic and truly generic database update class. Let&#8217;s check how the &#8220;generic&#8221; update/edit entry function currently looks:</p>
<pre name="code" class="python">
def update_data(self, values_list):
    &#039;&#039;&#039;edits and updates fields&#039;&#039;&#039;

    if sys.platform == &#039;darwin&#039;:
        (cursor, database) = link_db(self.db_path)
    else:
        (cursor, database) = link_db()

    # triple quotes let the SQL span several lines; the values go in one tuple
    cursor.execute(&quot;&quot;&quot;UPDATE bac SET projects = ?, comments = ?, temperature = ?, cell = ?, box = ?, tubes = ?,
        chromosome = ?, sdate = ?, clone = ?, source = ?, location1 = ?, startpos = ?, endpos = ?,
        gene = ?, genelink = ?, dnaex = ?, validation = ?, pcr = ?, refs = ?, antibiotic = ? WHERE idbac = ?&quot;&quot;&quot;,
        (values_list[&#039;projects&#039;], values_list[&#039;comments&#039;], values_list[&#039;temperature&#039;], values_list[&#039;cell&#039;], values_list[&#039;box&#039;], values_list[&#039;tubes&#039;],
         values_list[&#039;chromo&#039;], values_list[&#039;date&#039;], values_list[&#039;clone&#039;], values_list[&#039;source&#039;], values_list[&#039;location&#039;], values_list[&#039;start&#039;],
         values_list[&#039;end&#039;], values_list[&#039;gene&#039;], values_list[&#039;genelink&#039;], values_list[&#039;dna&#039;], values_list[&#039;validation&#039;], values_list[&#039;pcr&#039;],
         values_list[&#039;refs&#039;], values_list[&#039;antibiotic&#039;], values_list[&#039;idbac&#039;]))

    database.commit()
    database.close()
</pre>
<p>which is really ugly and, although it works, is not really useful outside this small project. Based on the comments, the best option was to use placeholders and a dictionary, similar to the approach used in the insert data function. The idea is to pre-format a string that has both the name of each field to be updated and a placeholder (for instance <code>:idbac</code>) that will receive the value:</p>
<pre name="code" class="python">
update = &#039;,&#039;.join([&#039;%s=:%s&#039; % (y, y) for y in values_list])
</pre>
<p>where <code>update</code> is the string we want and <code>values_list</code> is the dictionary with all the key-value pairs. I tried this approach, using this structure in the generic function, but then I decided that the best alternative was to put this <code>join</code> in the derived class function, pre-populate the string with the values and then send this string directly to the update function. In the end I opted to use this:</p>
<pre name="code" class="python">
update = &#039;,&#039;.join([&#039;%s=\&quot;%s\&quot;&#039; % (y, values_list[y]) for y in values_list])
</pre>
<p>The latter is slightly different from what was suggested. The original builds a string of key/placeholder pairs from the dictionary keys, for instance <code>sdate=:sdate</code>. With all these placeholders in place, you just pass the dictionary and all the values are inserted. This would be handy if the update string were being created in the &#8220;generic&#8221; function. If we move this to the derived class, we can use the alternative, keeping in mind that the values passed should be surrounded by quotes, otherwise the SQL UPDATE statement will have problems with spaces and other characters that should not be there. So instead of placeholders we will have <code>gene="PTEN"</code>, and we can attach this joined string to the actual command. We can then move all this machinery out of the &#8220;generic&#8221; function, which can be written as:</p>
<pre name="code" class="python">
def update_data(self, update_string):
    &#039;&#039;&#039;edits and updates fields&#039;&#039;&#039;

    if sys.platform == &#039;darwin&#039;:
        (cursor, database) = link_db(self.db_path)
    else:
        (cursor, database) = link_db()
    cursor.execute(update_string)

    database.commit()
    database.close()
</pre>
<p>That&#8217;s it, very elegant (we will see the derived class in the next post). To finish our generic class we need a delete function, so the user can eliminate entries that he/she doesn&#8217;t want anymore. It&#8217;s also a very simple function:</p>
<pre name="code" class="python">
def delete_data(self, delete_string):
    &#039;&#039;&#039;deletes one field&#039;&#039;&#039;

    if sys.platform == &#039;darwin&#039;:
        (cursor, database) = link_db(self.db_path)
    else:
        (cursor, database) = link_db()
    cursor.execute(delete_string)

    database.commit()
    database.close()
</pre>
<p>We will check the delete string next time. Again, I would like to thank everyone for the comments; they have been really helpful to me.</p>
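<p>For comparison, here is a minimal sketch of the placeholder variant suggested in the comments. It is not the project&#8217;s actual code: the helper name <code>build_update</code>, the three-column table and the in-memory database are assumptions made for illustration (only <code>idbac</code>, <code>gene</code> and <code>sdate</code> come from the posts). The point it tries to show is that with named placeholders sqlite3 binds the values itself, so a value containing quotes or spaces should not break the statement. Table and column names cannot be bound this way, so they still come from string formatting and must come from trusted code.</p>

```python
import sqlite3


def build_update(table_name, values, key_field):
    """Return an UPDATE statement with named placeholders (hypothetical helper)."""
    # Every column except the key goes into the SET clause as "column=:column".
    assignments = ', '.join('%s=:%s' % (name, name)
                            for name in values if name != key_field)
    return 'UPDATE %s SET %s WHERE %s=:%s' % (
        table_name, assignments, key_field, key_field)


# Illustrative in-memory table; the real "bac" table has many more columns.
database = sqlite3.connect(':memory:')
cursor = database.cursor()
cursor.execute('CREATE TABLE bac (idbac INTEGER PRIMARY KEY, gene TEXT, sdate TEXT)')
cursor.execute("INSERT INTO bac VALUES (1, 'TP53', '2009-01-01')")

# The gene value deliberately contains double quotes and a space.
values_list = {'idbac': 1, 'gene': 'PTEN "test"', 'sdate': '2009-04-22'}
update_string = build_update('bac', values_list, 'idbac')

# sqlite3 binds each :name placeholder to the matching dictionary key.
cursor.execute(update_string, values_list)
database.commit()

updated_row = cursor.execute('SELECT gene, sdate FROM bac WHERE idbac = 1').fetchone()
database.close()
```

<p>With this layout the generic function would receive both the statement and the dictionary, instead of a string with the values already pasted in.</p>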
<p>Previously in the series:<br />
<a href="http://python.genedrift.org/2009/02/09/managing-a-simple-database-with-python-sqlite-and-wxpython-1/">Part 1</a><br />
<a href="http://python.genedrift.org/2009/02/17/managing-a-simple-database-with-python-sqlite-and-wxpython-2/">Part 2</a><br />
<a href="http://python.genedrift.org/2009/02/18/managing-a-simple-database-with-python-sqlite-and-wxpython-3/">Part 3</a><br />
<a href="http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-4/">Part 4</a><br />
<a href="http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-5/">Part 5</a><br />
<a href="http://python.genedrift.org/2009/03/31/managing-a-simple-database-with-python-sqlite-and-wxpython-6/">Part 6</a><br />
<a href="http://python.genedrift.org/2009/04/20/managing-a-simple-database-with-python-sqlite-and-wxpython-7-includes-a-question/">Part 7</a></p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/04/22/managing-a-simple-database-with-python-sqlite-and-wxpython-8/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Managing a simple database with Python, SQLite and wxPython, 7 (includes a question)</title>
		<link>http://python.genedrift.org/2009/04/20/managing-a-simple-database-with-python-sqlite-and-wxpython-7-includes-a-question/</link>
		<comments>http://python.genedrift.org/2009/04/20/managing-a-simple-database-with-python-sqlite-and-wxpython-7-includes-a-question/#comments</comments>
		<pubDate>Mon, 20 Apr 2009 17:21:59 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[wxPython]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[SQLite]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/2009/04/20/managing-a-simple-database-with-python-sqlite-and-wxpython-7-includes-a-question/</guid>
		<description><![CDATA[And we&#8217;re back. After a couple of weeks of inactivity we will get back to our small soap opera of Python, wxPython and SQLite. Continuing with our database management code, let&#8217;s check two other functions that have changed since the first version of the code. The first one is the insert_data function, which now looks like this:

def [...]]]></description>
			<content:encoded><![CDATA[<p>And we&#8217;re back. After a couple of weeks of inactivity we will get back to our small soap opera of Python, wxPython and SQLite. Continuing with our database management code, let&#8217;s check two other functions that have changed since the first version of the code. The first one is the <code>insert_data</code> function, which now looks like this:</p>
<pre name="code" class="python">
def insert_data(self, values_list, insert_string):
    &#039;&#039;&#039;inserts data in the database&#039;&#039;&#039;

    if sys.platform == &#039;darwin&#039;:
        (cursor, database) = link_db(self.db_path)
    else:
        (cursor, database) = link_db()

    cursor.execute(insert_string % self.table_name, values_list)

    database.commit()
    database.close()
</pre>
<p>Basically no changes, apart from the obvious check for the operating system currently running, which was explained in the last post. The other function to check is <code>update_data</code>. This function is new and wasn&#8217;t in the first version but, as can be seen, it has a problem as a &#8220;generic&#8221; function, because it contains information specific to the table and database being used in the interface. The function receives the information that needs to be updated in the table&#8217;s fields and uses the SQL <code>UPDATE ... SET</code> statement to edit and update the data in the changed fields. I have tried several different syntaxes to make the execute call generic, mainly trying to pre-format the string, without success. If anyone reading this can help, I&#8217;d appreciate it.</p>
<pre name="code" class="python">
def update_data(self, values_list):
    &#039;&#039;&#039;edits and updates fields&#039;&#039;&#039;

    if sys.platform == &#039;darwin&#039;:
        (cursor, database) = link_db(self.db_path)
    else:
        (cursor, database) = link_db()

    # triple quotes let the SQL span several lines; the values go in one tuple
    cursor.execute(&quot;&quot;&quot;UPDATE bac SET projects = ?, comments = ?, temperature = ?, cell = ?, box = ?, tubes = ?,
        chromosome = ?, sdate = ?, clone = ?, source = ?, location1 = ?, startpos = ?, endpos = ?,
        gene = ?, genelink = ?, dnaex = ?, validation = ?, pcr = ?, refs = ?, antibiotic = ? WHERE idbac = ?&quot;&quot;&quot;,
        (values_list[&#039;projects&#039;], values_list[&#039;comments&#039;], values_list[&#039;temperature&#039;], values_list[&#039;cell&#039;], values_list[&#039;box&#039;], values_list[&#039;tubes&#039;],
         values_list[&#039;chromo&#039;], values_list[&#039;date&#039;], values_list[&#039;clone&#039;], values_list[&#039;source&#039;], values_list[&#039;location&#039;], values_list[&#039;start&#039;], values_list[&#039;end&#039;],
         values_list[&#039;gene&#039;], values_list[&#039;genelink&#039;], values_list[&#039;dna&#039;], values_list[&#039;validation&#039;], values_list[&#039;pcr&#039;],
         values_list[&#039;refs&#039;], values_list[&#039;antibiotic&#039;], values_list[&#039;idbac&#039;]))

    database.commit()
    database.close()
</pre>
<p>Anyway, I will explain the logic of the command (OK as a stopgap, but not as a definitive solution). <code>values_list</code> is a dictionary that is passed to the function and contains the field names as keys and the new/changed information as values. The execute method binds the value of each key to the corresponding placeholder in the update string, which is then sent to the database and table to be changed. Everything is committed and the database is closed.</p>
<p>As this is a &#8220;generic&#8221; function from a &#8220;generic&#8221; class, the ideal scenario would be for the function to receive a pre-formatted string with all the information, as in the insert data function, and update the information in the database.</p>
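<p>A rough sketch of that ideal scenario, under stated assumptions: the two statement strings, the helper <code>run_statement</code> and the three-column table below are hypothetical and not taken from the project. It only illustrates the two-step pattern already used by <code>insert_data</code>: the <code>%s</code> in the pre-formatted string is filled with the table name, and the <code>?</code> marks are left for sqlite3 to bind.</p>

```python
import sqlite3

# Hypothetical pre-formatted strings: %s is filled with the table name,
# the ? marks are bound by sqlite3 at execute time.
insert_string = 'INSERT INTO %s (idbac, gene, pcr) VALUES (?, ?, ?)'
update_string = 'UPDATE %s SET gene = ?, pcr = ? WHERE idbac = ?'


def run_statement(cursor, table_name, statement, values):
    """Fill in the table name, then let sqlite3 bind the values."""
    cursor.execute(statement % table_name, values)


# Illustrative in-memory table standing in for the real database file.
database = sqlite3.connect(':memory:')
cursor = database.cursor()
cursor.execute('CREATE TABLE bac (idbac INTEGER PRIMARY KEY, gene TEXT, pcr TEXT)')

# The same helper serves both the insert and the update.
run_statement(cursor, 'bac', insert_string, (1, 'PTEN', 'pending'))
run_statement(cursor, 'bac', update_string, ('PTEN', 'done', 1))
database.commit()

rows = cursor.execute('SELECT idbac, gene, pcr FROM bac').fetchall()
database.close()
```

<p>In this arrangement the derived class would own the statement strings and the order of the values, and the generic class would only execute them.</p>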
<p>I would like to thank in advance anyone who can comment on this. Next time we will continue checking the generic class and finalize this part, in order to start on the interface build process.</p>
<p>Previously in the series:<br />
<a href="http://python.genedrift.org/2009/02/09/managing-a-simple-database-with-python-sqlite-and-wxpython-1/">Part 1</a><br />
<a href="http://python.genedrift.org/2009/02/17/managing-a-simple-database-with-python-sqlite-and-wxpython-2/">Part 2</a><br />
<a href="http://python.genedrift.org/2009/02/18/managing-a-simple-database-with-python-sqlite-and-wxpython-3/">Part 3</a><br />
<a href="http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-4/">Part 4</a><br />
<a href="http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-5/">Part 5</a><br />
<a href="http://python.genedrift.org/2009/03/31/managing-a-simple-database-with-python-sqlite-and-wxpython-6/">Part 6</a></p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/04/20/managing-a-simple-database-with-python-sqlite-and-wxpython-7-includes-a-question/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Managing a simple database with Python, SQLite and wxPython, 6</title>
		<link>http://python.genedrift.org/2009/03/31/managing-a-simple-database-with-python-sqlite-and-wxpython-6/</link>
		<comments>http://python.genedrift.org/2009/03/31/managing-a-simple-database-with-python-sqlite-and-wxpython-6/#comments</comments>
		<pubDate>Tue, 31 Mar 2009 17:06:08 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[wxPython]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[SQLite]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=276</guid>
		<description><![CDATA[Let&#8217;s get back to our SQLite and wxPython project. We haven&#8217;t seen anything on wxPython yet, and we will check the interface only in the next post. For now, let&#8217;s see some extra code added to the SQLite access class. Remember that we have a generic class and one class derived from it [...]]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;">
<div>
<dl style="width: 212px;" class="wp-caption alignright">
<dt class="wp-caption-dt"><a href="http://commons.wikipedia.org/wiki/Image:SQLite_Logo_4.png"><img src="http://upload.wikimedia.org/wikipedia/commons/thumb/1/19/SQLite_Logo_4.png/202px-SQLite_Logo_4.png" alt="The :en:SQLite logo as of 2007-12-15" title="The :en:SQLite logo as of 2007-12-15" height="60" width="202"></a></dt>
<dd class="wp-caption-dd zemanta-img-attribution" style="font-size: 0.8em;">Image via <a href="http://commons.wikipedia.org/wiki/Image:SQLite_Logo_4.png">Wikipedia</a></dd>
</dl>
</div>
</div>
<p>Let&#8217;s get back to our SQLite and wxPython project. We haven&#8217;t seen anything on wxPython yet, and we will check the interface only in the next post. For now, let&#8217;s see some extra code added to the SQLite access class. Remember that we have a generic class and one class derived from it that accesses specific tables in our database file.</p>
<p>When we last covered the db access routines, there was no way to search for an entry (the function returned everything in the table no matter what), there was no update function in case someone wanted to modify an entry, and there was no delete method if you wanted to delete something. In the meantime, I added all of this functionality (and some more) to the generic class and extended it in the class derived from it. Let&#8217;s check how the generic class looks now (you will notice that there is an issue in one of the methods; if someone can help me, I&#8217;d appreciate it. More details later.)</p>
<pre name="code" class="python">
class DB_Generic():
    &#039;&#039;&#039;generic class to add DB functionality&#039;&#039;&#039;
    def __init__(self, table_name, db_path = &#039;&#039;):
        #par= name of the table to be used
        self.table_name = table_name
        if len(db_path) &gt; 0:
            self.db_path = db_path
            print(db_path)

    def get_data_generic(self, range = 1, bac_to_get = 0):
        &#039;&#039;&#039;gets the data from the database&#039;&#039;&#039;       

        if sys.platform == &#039;darwin&#039;:
            (cursor, database) = link_db(self.db_path)
        else:
            (cursor, database) = link_db()

        if range == 1:
            cursor.execute(&quot;&quot;&quot;SELECT * FROM %s&quot;&quot;&quot; % self.table_name)
        elif range == 2:
            cursor.execute(&quot;&quot;&quot;SELECT * FROM %s where idbac = %d&quot;&quot;&quot; % (self.table_name, bac_to_get))

        table_data = cursor.fetchall()
        raw_data = []
        for i in table_data:
            raw_data.append(list(i))

        self.table_data = raw_data
        database.close()

    def insert_data(self, values_list, insert_string):
        &#039;&#039;&#039;inserts data in the database&#039;&#039;&#039;

        if sys.platform == &#039;darwin&#039;:
            (cursor, database) = link_db(self.db_path)
        else:
            (cursor, database) = link_db()

        cursor.execute(insert_string % self.table_name, values_list)

        database.commit()
        database.close()

    def update_data(self, values_list):
        &#039;&#039;&#039;edits and updates fields&#039;&#039;&#039;

        if sys.platform == &#039;darwin&#039;:
            (cursor, database) = link_db(self.db_path)
        else:
            (cursor, database) = link_db()

        #change this to generic!!!!!!!!!!!!
        # triple quotes let the SQL span several lines
        cursor.execute(&quot;&quot;&quot;UPDATE bac SET projects = ?, comments = ?, temperature = ?, cell = ?, box = ?, tubes = ?,
            chromosome = ?, sdate = ?, clone = ?, source = ?, location1 = ?, startpos = ?, endpos = ?,
            gene = ?, genelink = ?, dnaex = ?, validation = ?, pcr = ?, refs = ?, antibiotic = ? WHERE idbac = ?&quot;&quot;&quot;,
            (values_list[&#039;projects&#039;], values_list[&#039;comments&#039;], values_list[&#039;temperature&#039;], values_list[&#039;cell&#039;], values_list[&#039;box&#039;], values_list[&#039;tubes&#039;],
             values_list[&#039;chromo&#039;], values_list[&#039;date&#039;], values_list[&#039;clone&#039;], values_list[&#039;source&#039;], values_list[&#039;location&#039;], values_list[&#039;start&#039;], values_list[&#039;end&#039;],
             values_list[&#039;gene&#039;], values_list[&#039;genelink&#039;], values_list[&#039;dna&#039;], values_list[&#039;validation&#039;], values_list[&#039;pcr&#039;],
             values_list[&#039;refs&#039;], values_list[&#039;antibiotic&#039;], values_list[&#039;idbac&#039;]))

        database.commit()
        database.close()

    def delete_data(self, delete_string):
        &#039;&#039;&#039;deletes one field&#039;&#039;&#039;

        if sys.platform == &#039;darwin&#039;:
            (cursor, database) = link_db(self.db_path)
        else:
            (cursor, database) = link_db()
        cursor.execute(delete_string)

        database.commit()
        database.close()
</pre>
<p>In the next couple of posts we&#8217;ll dissect each function and see what&#8217;s going on. The class definition wasn&#8217;t changed, so we start with <code>get_data_generic</code>:</p>
<pre name="code" class="python">
def get_data_generic(self, range = 1, bac_to_get = 0):
	&#039;&#039;&#039;gets the data from the database&#039;&#039;&#039;       

	if sys.platform == &#039;darwin&#039;:
		(cursor, database) = link_db(self.db_path)
	else:
		(cursor, database) = link_db()

	if range == 1:
		cursor.execute(&quot;&quot;&quot;SELECT * FROM %s&quot;&quot;&quot; % self.table_name)
	elif range == 2:
		cursor.execute(&quot;&quot;&quot;SELECT * FROM %s where idbac = %d&quot;&quot;&quot; % (self.table_name, bac_to_get))

	table_data = cursor.fetchall()
	raw_data = []
	for i in table_data:
		raw_data.append(list(i))

	self.table_data = raw_data
	database.close()
</pre>
<p>The first difference we notice here is the <code>sys.platform</code> usage. This is required if we intend to package our application as an OS X app, using py2app. When a Python/wxPython application is packaged in OS X, the actual application executable is inside a directory named after the application (or whatever you set up). In our case we don&#8217;t provide a way for the Python script to receive the path and name of the database on the command line, as we expect it to be in the executable&#8217;s current directory. Because of that we need to provide a &#8220;config&#8221; file (in our case a one-line text file with the database path) inside the application wrapper, something we will see at the end of the series.</p>
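<p>As a rough sketch of the idea, and not the code used in the application, the path lookup could work like this. The function name <code>resolve_db_path</code>, the default file name and the example path are all assumptions made for illustration; the platform is passed as a parameter only so the behaviour can be tried on any system.</p>

```python
import os
import sys
import tempfile

def resolve_db_path(config_file, default_path="lab.db", platform=sys.platform):
    """Return the database path: read from the config file on OS X, else the default."""
    if platform == "darwin" and os.path.exists(config_file):
        with open(config_file) as handle:
            # The config file holds a single line with the database path.
            return handle.readline().strip()
    return default_path

# A throwaway config file stands in for the one inside the application wrapper.
with tempfile.NamedTemporaryFile("w", suffix=".cfg", delete=False) as config:
    config.write("/Users/lab/data/lab.db\n")

mac_path = resolve_db_path(config.name, platform="darwin")
other_path = resolve_db_path(config.name, platform="linux")
os.remove(config.name)
```

<p>On OS X the path comes from the file, while on other systems the default in the current directory is used.</p>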
<p>Another modification here is the <code>range</code> parameter and the addition of the <code>bac_to_get</code> parameter. Notice that both parameters have a default value assigned to them. This means that they are optional: the function call can pass them or not. If it doesn&#8217;t, their values will be the ones assigned in the function declaration. So, if we are interested in getting all bacs, <code>range</code> will have the value of 1 and we don&#8217;t need to worry about it. If we want a specific bac we pass <code>range</code> as 2 and then pass the ID of the bac to be returned as <code>bac_to_get</code>. </p>
<p>A final change/addition is that we added a new select statement for the cases when <code>range</code> equals 2. This time we are adding the bac ID to be returned.</p>
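<p>The behaviour of the optional parameters can be tried in isolation with a small stand-alone sketch. It is not the class method itself: the function <code>get_rows</code>, the parameter name <code>select_mode</code> (used instead of <code>range</code> to avoid hiding the Python built-in), the in-memory database and the clone names are assumptions for illustration. The bac ID is bound with a <code>?</code> placeholder rather than string formatting, a swapped-in technique that the <code>sqlite3</code> module supports.</p>

```python
import sqlite3

def get_rows(connection, table_name, select_mode=1, bac_to_get=0):
    """Return rows as lists: every row (select_mode 1) or one bac (select_mode 2)."""
    cursor = connection.cursor()
    if select_mode == 1:
        cursor.execute("SELECT * FROM %s" % table_name)
    elif select_mode == 2:
        # The ID is bound with a ? placeholder instead of string formatting.
        cursor.execute("SELECT * FROM %s WHERE idbac = ?" % table_name, (bac_to_get,))
    return [list(row) for row in cursor.fetchall()]

# Hypothetical two-column table, just enough to exercise both calls.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE bac (idbac INTEGER PRIMARY KEY, clone TEXT)")
connection.executemany("INSERT INTO bac (clone) VALUES (?)", [("RP11-1",), ("RP11-2",)])

all_rows = get_rows(connection, "bac")       # defaults: every bac
one_row = get_rows(connection, "bac", 2, 2)  # only the bac with idbac 2
```

<p>The first call relies on the defaults and returns both rows; the second passes both arguments and returns only the requested bac.</p>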
<p>Previously in the series:<br />
<a href="http://python.genedrift.org/2009/02/09/managing-a-simple-database-with-python-sqlite-and-wxpython-1/">Part 1</a><br />
<a href="http://python.genedrift.org/2009/02/17/managing-a-simple-database-with-python-sqlite-and-wxpython-2/">Part 2</a><br />
<a href="http://python.genedrift.org/2009/02/18/managing-a-simple-database-with-python-sqlite-and-wxpython-3/">Part 3</a><br />
<a href="http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-4/">Part 4</a><br />
<a href="http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-5/">Part 5</a></p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/03/31/managing-a-simple-database-with-python-sqlite-and-wxpython-6/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>RoR commits</title>
		<link>http://python.genedrift.org/2009/03/15/ror-commits/</link>
		<comments>http://python.genedrift.org/2009/03/15/ror-commits/#comments</comments>
		<pubDate>Sun, 15 Mar 2009 16:59:18 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=272</guid>
		<description><![CDATA[Just illustrating my point (or lack of), an animation about the commits of RoR to its repository. Notice the jump after it was migrated to Github
Ruby on Rails from Ilya Grigorik on Vimeo.
Sorry for the non-Python post.
]]></description>
			<content:encoded><![CDATA[<p>Just illustrating my point (or lack of), an animation about the commits of RoR to its repository. Notice the jump after it was migrated to GitHub.</p>
<p><object width="400" height="225"><param name="allowfullscreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=2979844&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=&amp;fullscreen=1" /><embed src="http://vimeo.com/moogaloop.swf?clip_id=2979844&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=&amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="225"></embed></object><br /><a href="http://vimeo.com/2979844">Ruby on Rails</a> from <a href="http://vimeo.com/user1211508">Ilya Grigorik</a> on <a href="http://vimeo.com">Vimeo</a>.</p>
<p>Sorry for the non-Python post.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/03/15/ror-commits/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Managing a simple database with Python, SQLite and wxPython, 5</title>
		<link>http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-5/</link>
		<comments>http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-5/#comments</comments>
		<pubDate>Tue, 03 Mar 2009 00:23:42 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[wxPython]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[SQLite]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=253</guid>
		<description><![CDATA[We have seen how to connect, get and insert data (at least theoretically) in the database. Now, a little note about the SQL engine of choice here: SQLite. SQLite databases have the main characteristic that they are self-contained files. Also it does not require an installation, works without a server and works pretty well in [...]]]></description>
			<content:encoded><![CDATA[<p>We have seen how to connect, get and insert data (at least theoretically) in the database. Now, a little note about the SQL engine of choice here: SQLite. The main characteristic of SQLite databases is that they are self-contained files. SQLite also requires no installation, works without a server and works pretty well on most operating systems. </p>
<p>Basically for the type of application we&#8217;re developing here, SQLite seems ideal. It eliminates a lot of infrastructure that would be needed if we were working with MySQL or PostgreSQL. We don&#8217;t need a server or know how to configure users or manage the databases and tables. All we need is contained in a single file that can be transported from system to system and can be accessed from the computers used in the lab, mainly XP and OS X. Also some web frameworks (Rails and <a class="zem_slink" href="http://www.djangoproject.com" title="Django (web framework)" rel="homepage">Django</a>, for instance) can use SQLite, so in the end we can have a desktop application and a web application accessing the same file without extra configuration.</p>
<p>Now the database created for this application has 8 tables and almost no relationships among them. SQLite allows the creation of relationships but in our case only a couple were required. For the table we are using at the moment (bac) there is no need for relationships, although there are some fields that could benefit from a more relational structure. Also, SQLite doesn&#8217;t have the same data types that are found in the bigger SQL engines. All values are stored as text, integer, real (floating point numbers), null or blob (raw data: what you store is what you get). You can still declare columns as Boolean or Date, for instance; SQLite accepts these declared types and maps them to one of its own storage types. If you have no experience in creating databases, let&#8217;s check again the table we are using in this small project. First, I would recommend the use of some SQLite database editor. You can find pretty good ones for any computer system and there is even a Firefox extension that allows you to edit database files. Editors make it easier to generate the SQL table creation scripts and to visualize what we are doing.</p>
<p>So, the table bac looks like this:</p>
<pre name="code" class="sql">
CREATE TABLE bac
(idbac INTEGER PRIMARY KEY,
clone Text,
sdate Date,
source Text,
gene Text,
chromosome Text,
startpos Integer,
endpos Integer,
antibiotic Text,
location1 Text,
temperature Integer,
tubes Integer,
box Integer,
cell Integer,
dnaex Boolean,
validation Boolean,
pcr Boolean,
projects Text,
comments Text,
genelink Text,
refs Text);
</pre>
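<p>To see what SQLite does with declared types such as Date and Boolean, here is a minimal sketch using an in-memory database. The table <code>sample</code> and the inserted values are made up for illustration; only three of the columns from the table above are reproduced.</p>

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# Same declared types as the sdate, validation and tubes columns above.
connection.execute("CREATE TABLE sample (sdate Date, validation Boolean, tubes Integer)")
# The tubes value is passed as text on purpose.
connection.execute("INSERT INTO sample VALUES (?, ?, ?)", ("2009-03-02", True, "3"))

# typeof() reports the storage type SQLite actually used for each value.
stored_types = connection.execute(
    "SELECT typeof(sdate), typeof(validation), typeof(tubes) FROM sample").fetchone()
print(stored_types)
```

<p>The date is kept as text, the boolean as an integer, and the text <code>"3"</code> is converted to an integer because of the column&#8217;s declared type.</p>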
<p>If you go back to our last post, you will see that in the insert statement there is no mention of the <code>idbac</code> field. We don&#8217;t actually insert any value there; the values that populate this field are created automatically. <code>idbac</code> is our primary key, meaning it&#8217;s the unique identifier of each bac we insert in this table. And in SQLite an integer primary key is automatically assigned whenever a row is inserted without a value for it. So our first insertion will create <code>idbac</code> 1, the second will create <code>idbac</code> 2 and so on. </p>
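<p>The automatic numbering is easy to check with a short sketch. The two-column table and the clone names are assumptions; <code>lastrowid</code> is the cursor attribute from the <code>sqlite3</code> module that reports the key of the last inserted row.</p>

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE bac (idbac INTEGER PRIMARY KEY, clone Text)")

# No value is given for idbac; SQLite assigns it.
first = connection.execute("INSERT INTO bac (clone) VALUES (?)", ("clone-a",))
second = connection.execute("INSERT INTO bac (clone) VALUES (?)", ("clone-b",))

assigned_ids = [first.lastrowid, second.lastrowid]
print(assigned_ids)
```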
<p>I&#8217;m not going to go into details about database development and administration, but it&#8217;s usual and safe to create tables with an auto-incrementing integer primary key. These fields, apart from making it easier to identify records, make access to such records faster and are great when relationships among tables are set. Let&#8217;s say that we had a column user in our bac table. And let&#8217;s say we had a user table with two columns: user_id and name, user_id being an auto-increment primary key. The user column in bac could be linked with the user_id column in the user table, in what we call a one-to-many relationship (one user can insert as many bacs as he wants). One day we want to know who is actually working in the lab and we want to check how many bacs were catalogued by each user. We can easily search the user table and extract information from bacs at the same time thanks to the relationship between the tables. And the result should be returned quite quickly, as we are only comparing integers.</p>
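<p>The user example can be sketched as follows. Keep in mind that the actual bac table has no user column; both tables, the names and the clones below are hypothetical and only illustrate the one-to-many relationship and the count per user.</p>

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name Text);
    CREATE TABLE bac (idbac INTEGER PRIMARY KEY, clone Text,
                      user INTEGER REFERENCES user(user_id));
    INSERT INTO user (name) VALUES ('Ana'), ('Ben');
    INSERT INTO bac (clone, user) VALUES ('clone-a', 1), ('clone-b', 1), ('clone-c', 2);
""")

# Join the two tables on the integer key and count the bacs of each user.
bacs_per_user = connection.execute("""
    SELECT user.name, COUNT(bac.idbac)
    FROM user LEFT JOIN bac ON bac.user = user.user_id
    GROUP BY user.user_id
    ORDER BY user.user_id
""").fetchall()
print(bacs_per_user)
```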
<p>All the other fields/columns in our table are straightforward to understand. They are basically related to the type of data they need to store. <code>validation</code> is a boolean because the bac might have been validated or not, just as <code>dnaex</code> (DNA extraction). At the same time, the number of tubes stored in the freezer will always be an integer. So, why is temperature an integer? Because we can only store bacs in two types of freezers: -80 (ultra freezers) or -20 (regular freezers like the ones we can have at home), and we don&#8217;t need to worry about fractional numbers. </p>
<p>Well, this is a very short and limited explanation of tables and SQLite. The web is full of resources about it, so next time we will get back to Python.</p>
<p>Previously in the series:<br />
<a href="http://python.genedrift.org/2009/02/09/managing-a-simple-database-with-python-sqlite-and-wxpython-1/">Part 1</a><br />
<a href="http://python.genedrift.org/2009/02/17/managing-a-simple-database-with-python-sqlite-and-wxpython-2/">Part 2</a><br />
<a href="http://python.genedrift.org/2009/02/18/managing-a-simple-database-with-python-sqlite-and-wxpython-3/">Part 3</a><br />
<a href="http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-4/">Part 4</a></p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-5/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Managing a simple database with Python, SQLite and wxPython, 4</title>
		<link>http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-4/</link>
		<comments>http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-4/#comments</comments>
		<pubDate>Mon, 02 Mar 2009 16:58:09 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[SQLite]]></category>
		<category><![CDATA[wxPython]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=250</guid>
		<description><![CDATA[



Image via Wikipedia



Let&#8217;s continue building our small db app. As mentioned in the previous post we need now to instantiate a specific class from our generic SQLite access class. In order to do this we just have to declare a new class and its type will be DB_Generic.  

class Bac(DB_Generic)

This new class is called [...]]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;">
<div>
<dl style="width: 212px;" class="wp-caption alignright">
<dt class="wp-caption-dt"><a href="http://commons.wikipedia.org/wiki/Image:SQLite_Logo_4.png"><img src="http://upload.wikimedia.org/wikipedia/commons/thumb/1/19/SQLite_Logo_4.png/202px-SQLite_Logo_4.png" alt="The :en:SQLite logo as of 2007-12-15" title="The :en:SQLite logo as of 2007-12-15" height="60" width="202"></a></dt>
<dd class="wp-caption-dd zemanta-img-attribution" style="font-size: 0.8em;">Image via <a href="http://commons.wikipedia.org/wiki/Image:SQLite_Logo_4.png">Wikipedia</a></dd>
</dl>
</div>
</div>
<p>Let&#8217;s continue building our small db app. As mentioned in the previous post we now need to derive a specific class from our generic <a class="zem_slink" href="http://sqlite.org/" title="SQLite" rel="homepage">SQLite</a> access class. In order to do this we just have to declare a new class that inherits from <code>DB_Generic</code>.  </p>
<pre name="code" class="python">
class Bac(DB_Generic):
</pre>
<p>This new class is called Bac because it&#8217;s linked to the bac table in our database file. A side note: bacs are Bacterial Artificial Chromosomes and are used in different molecular biology techniques. In our case, bacs have incorporated human DNA segments and are mainly used as probes for deletion, duplication, etc. studies.</p>
<p>Now, back to our Python code. As soon as we subclass our generic class, the class we create has access to all methods from the parent class (through <code>self</code>), but we still need to add functionality and expose other methods that can be called on objects created from <code>Bac</code>.</p>
<p>Our derived class will be:</p>
<pre name="code" class="python">
class Bac(DB_Generic):
    def __init__(self):
        self.bac_data = []
        DB_Generic.__init__(self, &#039;bac&#039;)

    def get_data(self):
        self.get_data_generic()
        return self.table_data

    def load_data(self):
        pass

    def add_data(self, values_list):
        insert_string = &quot;&quot;&quot;INSERT INTO %s (projects, comments, temperature, cell, box, tubes, chromosome, sdate, clone, source,
        location1, startpos, endpos, gene, genelink, dnaex, validation, pcr, refs, antibiotic)
        VALUES (:projects, :comments, :temperature, :cell, :box, :tubes, :chromo, :date, :clone, :source, :location, :start,
        :end, :gene, :genelink, :dna, :validation, :pcr, :refs, :antibiotic)&quot;&quot;&quot;
        self.insert_data(values_list, insert_string)
</pre>
<p>Pretty simple so far, as we don&#8217;t have a lot of declared methods. Let&#8217;s check them one by one:</p>
<pre name="code" class="python">
def __init__(self):
    DB_Generic.__init__(self, &#039;bac&#039;)
</pre>
<p>The only line is the initialization required by the parent class, and we&#8217;re passing the value that is the table to be accessed. 	</p>
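<p>The mechanics of calling the parent initializer can be seen in a stripped-down sketch. The two classes below are stand-ins, not the real <code>DB_Generic</code> and <code>Bac</code>: their names and the <code>describe</code> method are assumptions for illustration.</p>

```python
class DBGenericSketch(object):
    """Stand-in for the parent class: it only stores the table name."""
    def __init__(self, table_name):
        self.table_name = table_name

    def describe(self):
        return "rows come from table %s" % self.table_name

class BacSketch(DBGenericSketch):
    def __init__(self):
        # Pass the table name up to the parent initializer.
        DBGenericSketch.__init__(self, "bac")

bac = BacSketch()
print(bac.describe())  # method inherited from the parent class
```

<p>The child class defines no <code>describe</code> method of its own, yet it can use the one from the parent, together with the table name set during initialization.</p>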
<pre name="code" class="python">
def get_data(self):
	self.get_data_generic()
	return self.table_data
</pre>
<p>The <code>get_data</code> function returns all the elements in our table (so far, we still don&#8217;t have an elegant range option) and has one line too many. We will get rid of some useless code here in the future, but it&#8217;s OK the way it is. Basically this code calls <code>get_data_generic</code> from the parent class and returns all the values stored in the table. </p>
<p>There is a function not yet complete that will load data, and will be used in the future. And the last one is the function that actually adds the data to the table with a SQL insert statement:</p>
<pre name="code" class="python">
def add_data(self, values_list):
	insert_string = &quot;&quot;&quot;INSERT INTO %s (projects, comments, temperature, cell, box, tubes, chromosome, sdate, clone, source,
	location1, startpos, endpos, gene, genelink, dnaex, validation, pcr, refs, antibiotic)
	VALUES (:projects, :comments, :temperature, :cell, :box, :tubes, :chromo, :date, :clone, :source, :location, :start,
	:end, :gene, :genelink, :dna, :validation, :pcr, :refs, :antibiotic)&quot;&quot;&quot;
	self.insert_data(values_list, insert_string)
</pre>
<p>In this function, we have a large string with all the SQL insert options. A SQL insert statement is divided into two parts, one where you state where to insert the values (the table and its columns) and another where you give the values. Usually simple insert statements will have this structure:</p>
<pre name="code" class="sql">
INSERT INTO my_table_name (table_column1, table_column2) VALUES (value1, value2);
</pre>
<p>So, we have the table we want to insert values into, its columns and the values we set for each column. Once executed, this will put value1 into table_column1 and value2 into table_column2. The actual syntax can vary a bit for different SQL engines but the structure is identical in most cases. Pretty simple.</p>
<p>For our insert string above, there are some aspects to call attention to. Again note the triple quotes around the statement. They let the string span several lines without any extra syntax. We also have a <code>%s</code> for the table name, which will be filled in by the parent class function that inserts values, then a list of the columns in the table and then a list of the values to insert. And why do the values to be inserted have this <code>:value</code> syntax? Because we are storing the values in a dictionary beforehand, and the &#8220;:&#8221; marks a named placeholder: the database module looks up the dictionary value for the corresponding key.</p>
<p>The insert string and the list of values (actually a dictionary, not the best variable/object name I must admit) are then sent to the parent class to be inserted. Storing the values to be inserted in a dictionary is OK for a one-time insert, where the values are obtained from a form. If you are parsing a large CSV or TSV file, it&#8217;s better to collect the rows in a list and insert them all at once.</p>
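<p>Both situations can be sketched with the <code>sqlite3</code> module. The reduced three-column table and the values are assumptions for illustration; <code>executemany</code> is one way the module offers to insert a whole list of records in a single call, and it is not necessarily what the final application will use.</p>

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE bac (idbac INTEGER PRIMARY KEY, clone Text, tubes Integer)")

insert_string = "INSERT INTO %s (clone, tubes) VALUES (:clone, :tubes)"

# One record from a form: the dictionary keys match the :name placeholders.
form_values = {"clone": "clone-a", "tubes": 2}
connection.execute(insert_string % "bac", form_values)

# Many records from a parsed file: a list of dictionaries, inserted in one call.
file_rows = [{"clone": "clone-b", "tubes": 1}, {"clone": "clone-c", "tubes": 4}]
connection.executemany(insert_string % "bac", file_rows)
connection.commit()

row_count = connection.execute("SELECT COUNT(*) FROM bac").fetchone()[0]
print(row_count)
```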
<p>We&#8217;re progressing. Next we will take a look at some simple SQL table structure and then move on to creating the form to insert the values and check the table.</p>
<p>Previously in the series:<br />
<a href="http://python.genedrift.org/2009/02/09/managing-a-simple-database-with-python-sqlite-and-wxpython-1/">Part 1</a><br />
<a href="http://python.genedrift.org/2009/02/17/managing-a-simple-database-with-python-sqlite-and-wxpython-2/">Part 2</a><br />
<a href="http://python.genedrift.org/2009/02/18/managing-a-simple-database-with-python-sqlite-and-wxpython-3/">Part 3</a></p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-4/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Why do I blog. Or: Science Blogging, is it worth?</title>
		<link>http://python.genedrift.org/2009/02/21/why-do-i-blog-or-science-blogging-is-it-worth/</link>
		<comments>http://python.genedrift.org/2009/02/21/why-do-i-blog-or-science-blogging-is-it-worth/#comments</comments>
		<pubDate>Sat, 21 Feb 2009 23:18:24 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=248</guid>
		<description><![CDATA[



Image via Wikipedia



Mirroring the post that appeared on Blind.Scientist
Some time ago there was a meme about science blogging and one of the questions were &#8220;why do you blog&#8221;. Well, I do it because of the &#8220;Nada Surf effect&#8221;. You don&#8217;t know the &#8220;Nada Surf effect&#8221;? Pity you weren&#8217;t in Washington, DC 2001. 
In March or [...]]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;">
<div>
<dl style="width: 212px;" class="wp-caption alignright">
<dt class="wp-caption-dt"><a href="http://en.wikipedia.org/wiki/Image:Nshl.jpg"><img src="http://upload.wikimedia.org/wikipedia/en/thumb/6/61/Nshl.jpg/202px-Nshl.jpg" alt="High/Low album cover" title="High/Low album cover" height="201" width="202"></a></dt>
<dd class="wp-caption-dd zemanta-img-attribution" style="font-size: 0.8em;">Image via <a href="http://en.wikipedia.org/wiki/Image:Nshl.jpg">Wikipedia</a></dd>
</dl>
</div>
</div>
<p><i>Mirroring the post that appeared on Blind.Scientist</i></p>
<p>Some time ago there was a meme about science blogging and one of the questions was &#8220;why do you blog&#8221;. Well, I do it because of the &#8220;<a class="zem_slink" href="http://www.nadasurf.com/" title="Nada Surf" rel="homepage">Nada Surf</a> effect&#8221;. You don&#8217;t know the &#8220;Nada Surf effect&#8221;? Pity you weren&#8217;t in Washington, DC in 2001. </p>
<p>In March or April of 2001, Nada Surf played a concert there. It was a small bar on 14th Street W, close to the more famous Black Cat. It was a spring night, and I was with a couple of Dutch friends who had told me about the concert, if I&#8217;m not wrong, a couple of days before. It was also mid-week, so you wouldn&#8217;t expect big crowds at most concerts. We left ISH around 7 pm, with time to spare before the 9 pm concert. We didn&#8217;t know the venue; we got there and it was empty, just a couple of souls at the bar. We sat and for about an hour we wondered if we were in the right place, until a guy came and asked if we were staying for the concert. We said yes, paid the US$ 7.50 admission and sipped our beers waiting for the opening act. Soon after we paid, a van parked outside and some guys started bringing music equipment inside. At that time there must have been around 20 people there. The van guys set up the instruments, spent 5 minutes soundchecking, and started. It was Ashtray Babyhead. </p>
<p>They played for 40 minutes and left as fast as they had arrived. Another van parked outside and this time Nada Surf members started unloading and setting up the stage. No roadies. OK, maybe one guy helped, but I&#8217;m getting old and the memory sometimes falters. At that point in time, almost 9 pm, the number of brave souls was at 50. They played as if they were playing for 50,000 people at Wembley. Nice set, great songs, unforgettable night. After the show, they sold CDs at the usual after-show gathering, and we talked about New York, Brazil and feijoada. </p>
<p>And why do I call it the &#8220;Nada Surf effect&#8221;? A band that used to play for thousands of people in festivals and stadiums, and had a number one video on MTV (Popular), played on a midweek night in a small bar in Washington, DC as if it were the band&#8217;s farewell. Every fan that night felt that they were the most important one, maybe even the only one.</p>
<p>And this type of example is the one that brings me to write this, and to write Beginning Python for Bioinformatics. Especially the latter (as I spend too much time here, writing about non-important stuff). If I can make one person have an idea, get one person out there to use Python, or at least to learn something extra in their lives, I&#8217;m happy. I don&#8217;t care if I have a huge audience or if I&#8217;m famous. I care for the undergrad that is starting today, the high school kid that is hacking at night, or the Java coder that is looking for some better language (OK, not really, but I couldn&#8217;t resist). And this is one of the things that I learned in Science: always give back and don&#8217;t expect anything in return. Just add to the pile of knowledge.</p>
<p>So, my advice for the three people that read this site is: let Nada Surf in!</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/02/21/why-do-i-blog-or-science-blogging-is-it-worth/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Managing a simple database with Python, SQLite and wxPython, 3</title>
		<link>http://python.genedrift.org/2009/02/18/managing-a-simple-database-with-python-sqlite-and-wxpython-3/</link>
		<comments>http://python.genedrift.org/2009/02/18/managing-a-simple-database-with-python-sqlite-and-wxpython-3/#comments</comments>
		<pubDate>Wed, 18 Feb 2009 16:06:58 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[SQLite]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=244</guid>
		<description><![CDATA[In the last post we saw how to connect to a SQLite database file and generate a cursor that would allow us to actually interact with such database. Now we need some functionality that will interact with the data, add, read, delete and search. As was mentioned before the idea is to have a generic [...]]]></description>
			<content:encoded><![CDATA[<p>In the last post we saw how to connect to a <a class="zem_slink" href="http://sqlite.org/" title="SQLite" rel="homepage">SQLite</a> database file and generate a cursor that would allow us to actually interact with such database. Now we need some functionality that will interact with the data: add, read, delete and search. As mentioned before, the idea is to have a generic database interaction class and a specific derived class for each table of the project. In the <code>db_obj.py</code> file we have an initial structure set, so let&#8217;s check the <code>DB_Generic</code> class. </p>
<pre name="code" class="python">
class DB_Generic():
    &#039;&#039;&#039;generic class to add DB functionality&#039;&#039;&#039;
    def __init__(self, table_name):
        #par= name of the table to be used
        self.table_name = table_name

    def delete_entry(self):
        pass

    def get_data_generic(self):
        &#039;&#039;&#039;gets the data from the database
        generic so far, needs to be updated to include range&#039;&#039;&#039;        

        range = 1
        (cursor, database) = link_db()

        if range == 1:
            cursor.execute(&quot;&quot;&quot;SELECT * from %s&quot;&quot;&quot; % self.table_name)

        table_data = cursor.fetchall()
        raw_data = []
        for i in table_data:
            raw_data.append(list(i))

        self.table_data = raw_data

        database.close()

    def insert_data(self, values_list, insert_string):
        &#039;&#039;&#039;inserts data in the database&#039;&#039;&#039;

        (cursor, database) = link_db()
        cursor.execute(insert_string % self.table_name, values_list)

        database.commit()
        database.close()
</pre>
<p>There are different functions in this class, and we will take a look at each one individually. We can see that the class is far from complete, something that we&#8217;ll fix in the next posts. We start with the class initialization:</p>
<pre name="code" class="python">
def __init__(self, table_name):
    #par= name of the table to be used
    self.table_name = table_name
</pre>
<p>Very simple and direct: it receives the name of the table that is being used by the interface (in this case). The table name is then stored in an attribute that can be accessed by other functions in the class. The function to delete entries is not finished, as we only have a <code>pass</code> in it. We&#8217;ll do it soon. Next we have a function that gets the data from the table. </p>
<pre name="code" class="python">
    def get_data_generic(self):
        &#039;&#039;&#039;gets the data from the database
        generic so far, needs to be updated to include range&#039;&#039;&#039;        

        range = 1
        (cursor, database) = link_db()

        if range == 1:
            cursor.execute(&quot;&quot;&quot;SELECT * from %s&quot;&quot;&quot; % self.table_name)

        table_data = cursor.fetchall()
        raw_data = []
        for i in table_data:
            raw_data.append(list(i))

        self.table_data = raw_data

        database.close()
</pre>
<p>So far it grabs everything: there is no range selection based on any of the table fields, so it&#8217;s a generic SQL <code>SELECT</code>. Let&#8217;s dissect it. The <code>range</code> object is a dummy variable that at the moment is there only to remind us that we need to include a range select. The next line is the most important in this function: it calls the <code>link_db</code> function and starts the connection. Remember that <code>link_db</code> returns a tuple with the cursor and the database connection. Basically we will work with cursor methods to get the data, and the first action is to execute a SQL <code>SELECT</code> statement where we select everything in the table:</p>
<pre name="code" class="python">
cursor.execute(&quot;&quot;&quot;SELECT * from %s&quot;&quot;&quot; % self.table_name)
</pre>
<p>Notice that the statement is a regular string and that we use string formatting (<code>%</code>) to add the name of the table we are searching, which was defined when we initialized the class object. Also notice the triple quotes around the select statement: they make it a <a class="zem_slink" href="http://en.wikipedia.org/wiki/String_literal" title="String literal" rel="wikipedia">string literal</a> that can span several lines and contain quote characters without escaping. They do not change how the database parses the statement.</p>
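<p>As a side note, a table name cannot be passed through a <code>?</code> placeholder, which is why string formatting is used for it, while values can. The sketch below is only an illustration under some assumptions: it is written for Python 3 with the standard <code>sqlite3</code> module, it uses an in-memory database with made-up data, and <code>KNOWN_TABLES</code> and <code>select_all</code> are hypothetical names that are not part of <code>db_obj.py</code>.</p>
<pre name="code" class="python">
import sqlite3

# Hypothetical whitelist: the only table names this sketch accepts.
KNOWN_TABLES = ("bac",)

def select_all(cursor, table_name):
    """Run SELECT * on table_name after checking it against KNOWN_TABLES."""
    if table_name not in KNOWN_TABLES:
        raise ValueError("unknown table: %s" % table_name)
    # Table names cannot be bound with "?", so string formatting is used here.
    cursor.execute("SELECT * FROM %s" % table_name)
    return cursor.fetchall()

# In-memory database with a made-up row, only for illustration.
db = sqlite3.connect(":memory:")
cur = db.cursor()
cur.execute("CREATE TABLE bac (idbac INTEGER PRIMARY KEY, clone TEXT)")
# Values, unlike table names, can go through a "?" placeholder.
cur.execute("INSERT INTO bac (clone) VALUES (?)", ("clone-A",))
rows = select_all(cur, "bac")
db.close()
</pre>
<p>Checking the name before formatting it into the statement is one possible way to keep this pattern safe if the table name ever comes from user input.</p>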
<p>So the <code>cursor.execute</code> line executes the statement we pass to the method, but it does not actually get the data <i>per se</i>. We need to use another method to fetch everything returned by the select. This is done by</p>
<pre name="code" class="python">
table_data = cursor.fetchall()
</pre>
<p>A couple of things here. String fields will be returned as unicode (in most cases). And the data comes back as a list of tuples, one tuple per row, each with as many items as the table has fields. It&#8217;s easier to work with lists than with tuples, so we add some code to convert each row:</p>
<pre name="code" class="python">
table_data = cursor.fetchall()
raw_data = []
for i in table_data:
    raw_data.append(list(i))

self.table_data = raw_data
</pre>
<p>There are extra lines here that are not needed, and we will get rid of them in a code refactoring soon. This short function is able to connect to the database, execute a SQL statement on a specified table and grab the selected data, storing it in <code>self.table_data</code> as a list of lists with every field and value available. We need to add a better selection statement later, and we will do that as soon as we have a good structure set.</p>
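<p>One possible shape for that refactoring is a list comprehension that converts the rows in a single step. This is only a sketch, not the code from the repository: it assumes Python 3 and an in-memory database with made-up rows.</p>
<pre name="code" class="python">
import sqlite3

# In-memory database with two made-up rows, only for illustration.
db = sqlite3.connect(":memory:")
cur = db.cursor()
cur.execute("CREATE TABLE bac (idbac INTEGER PRIMARY KEY, clone TEXT)")
cur.executemany("INSERT INTO bac (clone) VALUES (?)",
                [("clone-A",), ("clone-B",)])

cur.execute("SELECT * FROM bac ORDER BY idbac")
# fetchall() returns a list of tuples; convert every row to a list.
table_data = [list(row) for row in cur.fetchall()]
db.close()
</pre>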
<p>The last function in this generic class is the one that inserts data into the table.</p>
<pre name="code" class="python">
def insert_data(self, values_list, insert_string):
    &#039;&#039;&#039;inserts data in the database&#039;&#039;&#039;

    (cursor, database) = link_db()

    cursor.execute(insert_string % self.table_name, values_list)

    database.commit()
    database.close()
</pre>
<p>The procedure is the same: connect, get a cursor, execute a statement. In this case the extra step is not to fetch data but to commit the new data to the table, which is done by the <code>commit</code> method. We will explain later how the <code>execute</code> method works here and what <code>insert_string</code> and <code>values_list</code> contain. Notice that at the end we close the connection to the database, after the commit has saved the inserted data.</p>
<p>Next, we will derive a class from this generic one and see how easy it is to manipulate the data. Stay tuned.</p>
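<p>Until that explanation arrives, here is one plausible shape for the two arguments. It is a guess based only on how <code>execute</code> is called above: <code>insert_string</code> would hold a <code>%s</code> for the table name plus one <code>?</code> per value, and <code>values_list</code> would hold the values in the same order. The column names and values are made up, and the sketch assumes Python 3 with an in-memory database.</p>
<pre name="code" class="python">
import sqlite3

db = sqlite3.connect(":memory:")
cur = db.cursor()
cur.execute("CREATE TABLE bac (idbac INTEGER PRIMARY KEY, clone TEXT, gene TEXT)")

table_name = "bac"
# Assumed layout: %s receives the table name, each ? receives one value.
insert_string = "INSERT INTO %s (clone, gene) VALUES (?, ?)"
values_list = ["clone-A", "gene-X"]

# Same call pattern as in insert_data above.
cur.execute(insert_string % table_name, values_list)
db.commit()

cur.execute("SELECT clone, gene FROM bac")
stored = cur.fetchall()
db.close()
</pre>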
<p>Previously in the series:<br />
<a href="http://python.genedrift.org/2009/02/09/managing-a-simple-database-with-python-sqlite-and-wxpython-1/">Part 1</a><br />
<a href="http://python.genedrift.org/2009/02/17/managing-a-simple-database-with-python-sqlite-and-wxpython-2/">Part 2</a></p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/02/18/managing-a-simple-database-with-python-sqlite-and-wxpython-3/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Managing a simple database with Python, SQLite and wxPython, 2</title>
		<link>http://python.genedrift.org/2009/02/17/managing-a-simple-database-with-python-sqlite-and-wxpython-2/</link>
		<comments>http://python.genedrift.org/2009/02/17/managing-a-simple-database-with-python-sqlite-and-wxpython-2/#comments</comments>
		<pubDate>Tue, 17 Feb 2009 16:04:18 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[SQLite]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=242</guid>
		<description><![CDATA[Let&#8217;s continue coding our small Python + SQLite application. The initial idea was to have a file for the interface and another file for the DB access. We will start with the latter. If you have access to the repository you will see two Python files, bac_form.py and db_obj.py. At the moment they are not [...]]]></description>
			<content:encoded><![CDATA[<p>Let&#8217;s continue coding our small Python + SQLite application. The initial idea was to have a file for the interface and another file for the DB access. We will start with the latter. If you have access to the repository you will see two Python files, <code>bac_form.py</code> and <code>db_obj.py</code>. At the moment they are not well commented and have some junk lines at the bottom, a legacy from older versions. Take a look at <code>db_obj.py</code>.</p>
<p>It has two class declarations, one called <code>DB_Generic</code> and another called <code>Bac</code>. Remember that in the last post I mentioned the idea of having several simple tables in the same SQLite database, each with a simple input/output interface (if I didn&#8217;t mention that, I just did!). So we can create a generic DB access class and derive a subclass from it for every table we will be using. At the moment the <code>db_obj.py</code> file has the generic database management class, a class derived from it to access the Bac table, and an initialization function that opens the SQLite file. Let&#8217;s take a look at that function:</p>
<pre name="code" class="python">
import sqlite3
import sys

def link_db():
    &#039;&#039;&#039;initializes the database file&#039;&#039;&#039;
    try:
        db = sqlite3.connect(&quot;samples.db&quot;)
    except sqlite3.Error, errmsg:
        print &#039;DB not available &#039; + str(errmsg)
        sys.exit()

    db_cursor = db.cursor()
    return db_cursor, db
</pre>
<p>To access a SQLite database file we first need the name of the file. After importing sqlite3 (we&#8217;re using the latest version of SQLite here) Python has everything it needs to access, change and manipulate data in a SQLite database. To be sure the database file can be opened and we don&#8217;t get an unhandled error, we put the initialization code inside a <code>try</code>/<code>except</code> block. Note that <code>sqlite3.connect</code> creates a new, empty database file when the named file doesn&#8217;t exist, so the exception mostly covers the cases where the file cannot be opened or created. We have seen exceptions before, and here we use one to be sure the database file has been accessed with no problems. The exception structure looks like</p>
<pre name="code" class="python">
try:
    # here we try to do something
    # the code placed here is executed
    # if no error is raised it runs to the end and the except block is skipped
    pass
except:
    # if an error (exception) is raised, the code under except is executed
    pass
</pre>
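<p>To see the <code>except</code> branch in action, the sketch below tries to open a database inside a directory that is assumed not to exist. It is written for Python 3, and <code>try_connect</code> and the directory name are hypothetical, chosen only for this illustration.</p>
<pre name="code" class="python">
import os
import sqlite3
import tempfile

def try_connect(path):
    """Return a connection to path, or None when it cannot be opened."""
    try:
        return sqlite3.connect(path)
    except sqlite3.Error as errmsg:
        print("DB not available: " + str(errmsg))
        return None

# Assumption: this directory does not exist, so SQLite cannot create the file.
missing_path = os.path.join(tempfile.gettempdir(),
                            "no_such_dir_for_this_sketch", "samples.db")
connection = try_connect(missing_path)
</pre>
<p>Returning <code>None</code> instead of calling <code>sys.exit()</code> is a design choice of this sketch; it leaves the decision of what to do next to the caller.</p>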
<p>So, the first step is to connect to the database file:</p>
<pre name="code" class="python">
db = sqlite3.connect(&quot;samples.db&quot;)
</pre>
<p>In our case it&#8217;s a fixed file name, but <code>connect</code> accepts any string with a file path. It could have been a parameter obtained from the command line or a string from a configuration file. If the connection is successful, no error is raised and we are safe to continue. We are connected to the database, now what? We need a cursor, an object that actually accesses the data and allows us to execute <a class="zem_slink" href="http://en.wikipedia.org/wiki/SQL" title="SQL" rel="wikipedia">SQL</a> commands on it.</p>
<pre name="code" class="python">
db_cursor = db.cursor()
</pre>
<p>The <code>cursor</code> method is called on the database connection object that we created previously. We&#8217;re set. The function returns the cursor and the database connection objects in a tuple, and it can be called from the classes we are going to work with. The classes will then have a connection to the database and a cursor to select, delete and add data.</p>
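<p>A short usage sketch may help to show how the returned tuple is unpacked and used. It is a simplified Python 3 variant of <code>link_db</code> without the error handling, and the <code>db_file</code> parameter is an addition made only for this example so that it can run against an in-memory database.</p>
<pre name="code" class="python">
import sqlite3

def link_db(db_file=":memory:"):
    """Return a (cursor, connection) tuple for db_file."""
    db = sqlite3.connect(db_file)
    return db.cursor(), db

# Unpack the tuple, as the classes will do.
cursor, database = link_db()
cursor.execute("CREATE TABLE bac (idbac INTEGER PRIMARY KEY, clone TEXT)")
cursor.execute("INSERT INTO bac (clone) VALUES (?)", ("clone-A",))
database.commit()   # commit is a method of the connection
cursor.execute("SELECT clone FROM bac")
clones = cursor.fetchall()
database.close()
</pre>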
<p>Next time we&#8217;ll see how our generic table class works.</p>
<p>Previously in the series:<br />
<a href="http://python.genedrift.org/2009/02/09/managing-a-simple-database-with-python-sqlite-and-wxpython-1/">Part 1</a></p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/02/17/managing-a-simple-database-with-python-sqlite-and-wxpython-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Managing a simple database with Python, SQLite and wxPython, 1</title>
		<link>http://python.genedrift.org/2009/02/09/managing-a-simple-database-with-python-sqlite-and-wxpython-1/</link>
		<comments>http://python.genedrift.org/2009/02/09/managing-a-simple-database-with-python-sqlite-and-wxpython-1/#comments</comments>
		<pubDate>Mon, 09 Feb 2009 20:11:57 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[Graphical user interface]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[SQLite]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=238</guid>
		<description><![CDATA[A little break from reviewing the book, let&#8217;s check some database topics in Python. I was asked to create a simple database to organize wet-lab stuff. No relationships needed, no relational tables required. Just a simple table with determined columns, and a nice GUI to go with it so people can edit, search [...]]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;">
<div>
<dl style="width: 212px;" class="wp-caption alignright">
<dt class="wp-caption-dt"><a href="http://commons.wikipedia.org/wiki/Image:WxPython-logo.png"><img src="http://upload.wikimedia.org/wikipedia/commons/thumb/c/c0/WxPython-logo.png/202px-WxPython-logo.png" alt="The official wxPython logo" title="The official wxPython logo" height="126" width="202"></a></dt>
<dd class="wp-caption-dd zemanta-img-attribution" style="font-size: 0.8em;">Image via <a href="http://commons.wikipedia.org/wiki/Image:WxPython-logo.png">Wikipedia</a></dd>
</dl>
</div>
</div>
<p>A little break from reviewing the book, let&#8217;s check some database topics in Python. I was asked to create a simple database to organize wet-lab stuff. No relationships needed, no relational tables required. Just a simple table with determined columns, and a nice GUI to go with it so people can edit, search and use.</p>
<p>My first idea was to use a SQLite database, and I stuck with it. After the initial phase of &#8220;interviews&#8221; to check database requirements, I ended up with a list of tables and decided to start working on the table that organizes the BACs used in the lab. A BAC (bacterial artificial chromosome) is a DNA vector into which large DNA fragments can be inserted and cloned in a bacterial host; BACs are used mainly in cytogenetics around here. In the end the table had this structure:</p>
<pre name="code" class="sql">
CREATE TABLE bac
(idbac INTEGER PRIMARY KEY,
clone TEXT,
sdate DATE,
source TEXT,
gene TEXT,
chromosome TEXT,
startpos INTEGER,
endpos INTEGER,
antibiotic TEXT,
location1 TEXT,
temperature INTEGER,
tubes INTEGER,
box INTEGER,
cell INTEGER,
dnaex BOOLEAN,
validation BOOLEAN,
pcr BOOLEAN,
projects TEXT,
comments TEXT,
genelink TEXT,
refs TEXT);
</pre>
<p>I won&#8217;t explain each of the fields in detail, but we can see that there is a mix of different types. SQLite has only a handful of storage classes (NULL, INTEGER, REAL, TEXT and BLOB), so we stick to the basics; declared types such as <code>Date</code> and <code>Boolean</code> are accepted, but the values are stored using one of those classes.</p>
<p>And why SQLite? The module to access it comes with Python 2.5, the whole database is stored in one file that can be moved around, and it supports most of the SQL query language, which is perfect for these simple cases. So we are going to use Python, SQLite and wxPython to create a simple application to manage our simple database.</p>
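<p>A quick way to see how SQLite actually stores values for these declared types is the SQL function <code>typeof</code>. The sketch below is only an illustration: it assumes Python 3, uses a made-up table called <code>demo</code> with four of the column types from the structure above, and inserts made-up values.</p>
<pre name="code" class="python">
import sqlite3

db = sqlite3.connect(":memory:")
cur = db.cursor()
# Hypothetical demo table reusing four declared types from the bac table.
cur.execute("CREATE TABLE demo (sdate Date, pcr Boolean, tubes Integer, clone Text)")
cur.execute("INSERT INTO demo VALUES (?, ?, ?, ?)",
            ("2009-02-09", 1, 4, "clone-A"))

# typeof() reports the storage class SQLite used for each stored value.
cur.execute("SELECT typeof(sdate), typeof(pcr), typeof(tubes), typeof(clone) FROM demo")
storage = cur.fetchone()
db.close()
</pre>
<p>Here the date string is kept as text and the boolean flag as an integer, so the application code has to agree on what those values mean.</p>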
<p>I already created a GitHub <a href="http://github.com/nuin/python_sqlite_wxpython/tree/master">repository</a> for the upcoming code.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/02/09/managing-a-simple-database-with-python-sqlite-and-wxpython-1/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>BPforB is now PEP 8 compliant!</title>
		<link>http://python.genedrift.org/2009/01/31/bpforb-is-now-pep-8-compliant/</link>
		<comments>http://python.genedrift.org/2009/01/31/bpforb-is-now-pep-8-compliant/#comments</comments>
		<pubDate>Sat, 31 Jan 2009 17:51:10 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=222</guid>
		<description><![CDATA[As mentioned in the previous post, Robin Stocker kindly provided a git patch with the changes required to make all scripts stored on the repository compliant with PEP 8.
The changes were mainly regarding variable/object names, but they were important as they make the code available here more Pythonic following the rules of the Benevolent [...]]]></description>
			<content:encoded><![CDATA[<p>As mentioned in the previous post, Robin Stocker kindly provided a git patch with the changes required to make all scripts stored on the repository compliant with PEP 8.</p>
<p>The changes were mainly regarding variable/object names, but they were important as they make the code available here more Pythonic following the rules of the Benevolent Dictator for Life.</p>
<p>I would like to thank Robin for spending his time doing this. Much appreciated.</p>
<p>Now, just a quick git tutorial on how to apply patches:</p>
<p>git apply __patch_file__<br />
git commit -a -m &quot;patch applied&quot;<br />
git push</p>
<p>That&#8217;s it. Apply, commit, push and you&#8217;re done. The repository is already updated.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/01/31/bpforb-is-now-pep-8-compliant/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Finally it&#8217;s 2009 &#8230;</title>
		<link>http://python.genedrift.org/2009/01/31/finally-its-2009/</link>
		<comments>http://python.genedrift.org/2009/01/31/finally-its-2009/#comments</comments>
		<pubDate>Sat, 31 Jan 2009 17:30:45 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=220</guid>
		<description><![CDATA[And &#8230; we&#8217;re back. The long and cold winter is still out there and January 2009 is almost in the books. After a long period without updating I&#8217;ll try to &#8220;rush&#8221; some posts this week, trying to get back on track. So, a little bit of what&#8217;s up and coming:
- a patch provided [...]]]></description>
			<content:encoded><![CDATA[<div class="zemanta-img" style="margin: 1em; display: block;">
<div>
<dl style="width: 212px;" class="wp-caption alignright">
<dt class="wp-caption-dt"><a href="http://en.wikipedia.org/wiki/Image:PythonProgLogo.png"><img src="http://upload.wikimedia.org/wikipedia/en/thumb/2/25/PythonProgLogo.png/202px-PythonProgLogo.png" alt="Python logo, 1990s-2005" title="Python logo, 1990s-2005" width="202" height="56"></a></dt>
<dd class="wp-caption-dd zemanta-img-attribution" style="font-size: 0.8em;">Image via <a href="http://en.wikipedia.org/wiki/Image:PythonProgLogo.png">Wikipedia</a></dd>
</dl>
</div>
</div>
<p>And &#8230; we&#8217;re back. The long and cold winter is still out there and January 2009 is almost in the books. After a long period without updating I&#8217;ll try to &#8220;rush&#8221; some posts this week, trying to get back on track. So, a little bit of what&#8217;s up and coming:</p>
<p>- a patch provided by Robin Stocker to make all scripts published here (at least the ones on GitHub) <a href="http://www.python.org/dev/peps/pep-0008/">PEP 8</a> compliant.</p>
<p>- using SQLite databases in Python</p>
<p>- developing an interface to access the database</p>
<p>- anything that you might suggest, just leave a comment.</p>
<p>Let&#8217;s start 2009 then.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/01/31/finally-its-2009/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Twitter</title>
		<link>http://python.genedrift.org/2009/01/30/twitter/</link>
		<comments>http://python.genedrift.org/2009/01/30/twitter/#comments</comments>
		<pubDate>Sat, 31 Jan 2009 00:47:48 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=218</guid>
		<description><![CDATA[I&#8217;ve been on Twitter for quite some time. Some Python stuff, some biology, some bioinformatics, and a little bit of everything else.
nuin.
]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been on Twitter for quite some time. Some Python stuff, some biology, some bioinformatics, and a little bit of everything else.</p>
<p><a href="http://twitter.com/nuin">nuin</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2009/01/30/twitter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>That&#8217;s it for 2008</title>
		<link>http://python.genedrift.org/2008/12/19/thats-it-for-2008/</link>
		<comments>http://python.genedrift.org/2008/12/19/thats-it-for-2008/#comments</comments>
		<pubDate>Fri, 19 Dec 2008 17:34:15 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/2008/12/19/thats-it-for-2008/</guid>
		<description><![CDATA[The date came and is now gone, and I forgot to &#8220;celebrate&#8221; two years of Beginning Python for Bioinformatics on December 13th. I would like to thank everyone that commented, helped with posts and suggested anything that would make this website better. Clearly it is far from being what I wanted it to be, but [...]]]></description>
			<content:encoded><![CDATA[<p>The date came and is now gone, and I forgot to &#8220;celebrate&#8221; two years of Beginning Python for Bioinformatics on December 13th. I would like to thank everyone that commented, helped with posts and suggested anything that would make this website better. Clearly it is far from being what I wanted it to be, but slowly but surely we will get there.</p>
<p>Thanks again and I wish an excellent holiday season and a great 2009 to everyone!</p>
<p>See you in 2009.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2008/12/19/thats-it-for-2008/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Git repository updated</title>
		<link>http://python.genedrift.org/2008/09/12/git-repository-updated/</link>
		<comments>http://python.genedrift.org/2008/09/12/git-repository-updated/#comments</comments>
		<pubDate>Sat, 13 Sep 2008 02:48:55 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=168</guid>
		<description><![CDATA[I just updated the git repository of BPB. Click here to access it. Most of the code presented in the blog is there, some with extra comments, some being updated. 
This closes another phase in the blog and soon we will check some different aspects of Python programming in Bioinformatics.

]]></description>
			<content:encoded><![CDATA[<p><span class="zemanta-img" style="margin: 1em; float: right; display: block;"><a href="http://commons.wikipedia.org/wiki/Image:Commercial_st.jpg"><img src="http://upload.wikimedia.org/wikipedia/commons/thumb/a/ae/Commercial_st.jpg/202px-Commercial_st.jpg" alt="Commercial Street is an important commercial a..." style="border: medium none ; display: block;"></a><span class="zemanta-img-attribution" style="margin: 1em 0pt 0pt; display: block;">Image via <a href="http://commons.wikipedia.org/wiki/Image:Commercial_st.jpg">Wikipedia</a> </span></span>I just updated the git repository of BPB. Click <a href="http://github.com/nuin/beginning-python-for-bioinformatics/tree/master">here</a> to access it. Most of the code presented in the blog is there, some with extra comments, some being updated. </p>
<p>This closes another phase in the blog and soon we will check some different aspects of <a href="http://en.wikipedia.org/wiki/Python_%28programming_language%29" title="Python (programming language)" rel="wikipedia" class="zem_slink">Python programming</a> in Bioinformatics.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2008/09/12/git-repository-updated/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Python, overrepresented motifs, the Grand Finale</title>
		<link>http://python.genedrift.org/2008/09/05/python-overepresented-motifs-the-grand-finale/</link>
		<comments>http://python.genedrift.org/2008/09/05/python-overepresented-motifs-the-grand-finale/#comments</comments>
		<pubDate>Sat, 06 Sep 2008 02:27:56 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[motifs]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[determination]]></category>
		<category><![CDATA[Motif]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=163</guid>
		<description><![CDATA[In this final part, let&#8217;s do some very simple refactoring and modify the output section to make the result a little bit better. There is not much to change in the function that calculates the binomial expansion. But Andrew posted some suggestions on how to slightly change the quorum function.

def get_quorums(seqs, mlen):
    &#34;&#34;&#34;
 [...]]]></description>
			<content:encoded><![CDATA[<p>In this final part, let&#8217;s do some very simple refactoring and modify the output section to make the result a little bit better. There is not much to change in the function that calculates the <a href="http://en.wikipedia.org/wiki/Binomial_theorem" title="Binomial theorem" rel="wikipedia" class="zem_slink">binomial expansion</a>. But Andrew posted some suggestions on how to slightly change the quorum function.</p>
<pre name="code" class="python">
def get_quorums(seqs, mlen):
    &quot;&quot;&quot;
    count how many times each motif of length mlen
    occurs in the sequences
    &quot;&quot;&quot;
    quorum = defaultdict(int)
    for seq in seqs:
        # + 1 so the last full-length window is included
        for n in range(len(seq) - mlen + 1):
            quorum[seq[n:n + mlen]] += 1
    return quorum
</pre>
<p>His modifications were small but improved the code a bit, as they remove one variable/object from the function. At the same time we need to change the output section of the code a bit, as the <code>defaultdict</code> is no longer initialized with a set but with an integer.</p>
<pre name="code" class="python">
for i in foreground:
    term1 = choose(background[i], foreground[i])
    term2 = choose((N - background[i]), len(input_seqs)-1)
    term3 = choose(N, len(input_seqs))
    p = (float(term1) * float(term2)) / term3
    if 0 &lt; p &lt;= 0.0001:
        print i, foreground[i], background[i], p
</pre>
<p>Notice that in the <code>term1</code> line we don&#8217;t check the set length anymore and just use the integers stored in <code>foreground</code> and <code>background</code>. Again a small change that makes the code a little clearer. But we still need to modify this section so the output is easier to read, maybe ordered by motif sequence.</p>
<p>Because a dictionary does not keep its keys in sorted order, our results are not ordered. It would be great to have a final list starting with AAAAAAAAAA and ending with TTTTTTTTTT. There is an easy way to do that, and it is very inexpensive in terms of code and final performance. Basically we append each one of the motifs (and their extra information) to a list and use the <code>sort</code> method for lists. So our output section of the code will be</p>
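<p>To see why the counts can be used directly, here is a small sketch of how a <code>defaultdict(int)</code> behaves when counting motifs. It assumes Python 3, uses two made-up toy sequences, and <code>count_motifs</code> is a hypothetical name; the window loop runs up to <code>len(seq) - mlen + 1</code> so that the last full-length motif of each sequence is included.</p>
<pre name="code" class="python">
from collections import defaultdict

def count_motifs(seqs, mlen):
    """Count every substring of length mlen found in seqs."""
    counts = defaultdict(int)   # missing keys start at 0
    for seq in seqs:
        for n in range(len(seq) - mlen + 1):
            counts[seq[n:n + mlen]] += 1
    return counts

# Two made-up toy sequences, only for illustration.
counts = count_motifs(["ACGTAC", "ACGG"], 3)
acg_count = counts["ACG"]       # ACG occurs once in each sequence
missing_count = counts["TTT"]   # never seen, so the default 0 is returned
</pre>
<p>Keep in mind that reading a missing key, as in the last line, also adds that key to the dictionary with a value of 0.</p>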
<pre name="code" class="python">
res_motifs = []
for i in foreground:
    term1 = choose(background[i], foreground[i])
    term2 = choose((N - background[i]), len(input_seqs)-1)
    term3 = choose(N, len(input_seqs))
    p = (float(term1) * float(term2)) / term3
    if 0 &lt; p &lt;= 0.0001:
        res_motifs.append(i + &#039;\t&#039; + str(foreground[i]) + &#039;\t&#039; + str(background[i]) + &#039;\t&#039; + str(p))

res_motifs.sort()
for i in res_motifs:
    print i
</pre>
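<p>An alternative worth considering, sketched here under the assumption of Python 3 and with made-up motifs and values, is to keep each result as a tuple instead of a tab-joined string. The list still sorts by motif because the motif is the first item, and the same data can also be ordered by p-value.</p>
<pre name="code" class="python">
# Made-up results: (motif, foreground count, background count, p-value).
results = [
    ("TTTTTTTTTT", 3, 1, 0.00004),
    ("AAAAAAAAAA", 5, 2, 0.00001),
    ("ACGTACGTAC", 4, 1, 0.00009),
]

# Tuples are compared item by item, so this orders by motif sequence.
results.sort()
motif_order = [row[0] for row in results]

# The same data ordered by p-value, smallest first.
by_p = sorted(results, key=lambda row: row[3])
p_order = [row[0] for row in by_p]

# Build the tab-separated lines only when printing.
for row in results:
    print("\t".join(str(field) for field in row))
</pre>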
<p>Putting everything together our final motif determination script is (batteries included):</p>
<pre name="code" class="python">
#!/usr/bin/env python

import fasta
import sys
from collections import defaultdict

def choose(n, k):
    if 0 &lt;= k &lt;= n:
        ntok = 1
        ktok = 1
        for t in xrange(1, min(k, n - k) + 1):
            ntok *= n
            ktok *= t
            n -= 1
        return ntok // ktok
    else:
        return 0

def get_quorums(seqs, mlen):
    &quot;&quot;&quot;
    count how many times each motif of length mlen
    occurs in the sequences
    &quot;&quot;&quot;
    quorum = defaultdict(int)
    for seq in seqs:
        # + 1 so the last full-length window is included
        for n in range(len(seq) - mlen + 1):
            quorum[seq[n:n + mlen]] += 1
    return quorum

input_seqs = fasta.read_seqs(open(sys.argv[1]).readlines())
input_seqs2 = fasta.read_seqs(open(sys.argv[2]).readlines())

foreground = get_quorums(input_seqs, 10)
background = get_quorums(input_seqs2, 10)

N = len(input_seqs) + len(input_seqs2)

res_motifs = []
for i in foreground:
    term1 = choose(background[i], foreground[i])
    term2 = choose((N - background[i]), len(input_seqs)-1)
    term3 = choose(N, len(input_seqs))
    p = (float(term1) * float(term2)) / term3
    if 0 &lt; p &lt;= 0.0001:
        res_motifs.append(i + &#039;\t&#039; + str(foreground[i]) + &#039;\t&#039; + str(background[i]) + &#039;\t&#039; + str(p))

res_motifs.sort()
for i in res_motifs:
    print i
</pre>
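<p>As a sanity check on <code>choose</code>, the sketch below ports it to Python 3 (<code>range</code> instead of <code>xrange</code>, and a membership test in place of the chained comparison) and compares it with the textbook factorial formula. <code>choose_by_factorials</code> is a hypothetical helper written only for this comparison.</p>
<pre name="code" class="python">
from math import factorial

def choose(n, k):
    """Binomial coefficient, ported from the script above to Python 3."""
    if k not in range(n + 1):   # same test as 0 to n inclusive, for integers
        return 0
    ntok = 1
    ktok = 1
    for t in range(1, min(k, n - k) + 1):
        ntok *= n
        ktok *= t
        n -= 1
    return ntok // ktok

def choose_by_factorials(n, k):
    """Textbook formula n! / (k! (n - k)!), valid when k is between 0 and n."""
    return factorial(n) // (factorial(k) * factorial(n - k))

checks = [(5, 2), (10, 3), (52, 5)]
agree = all(choose(n, k) == choose_by_factorials(n, k) for n, k in checks)
</pre>
<p>The loop in <code>choose</code> only runs <code>min(k, n - k)</code> times, which keeps the intermediate products much smaller than full factorials.</p>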
<p>Next we will see some basic Python methods. And maybe start a new series and phase.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2008/09/05/python-overepresented-motifs-the-grand-finale/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Obtaining overrepresented motifs in DNA sequences, final</title>
		<link>http://python.genedrift.org/2008/09/03/obtaining-overrepresented-motifs-in-dna-sequences-final/</link>
		<comments>http://python.genedrift.org/2008/09/03/obtaining-overrepresented-motifs-in-dna-sequences-final/#comments</comments>
		<pubDate>Thu, 04 Sep 2008 01:21:38 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Phase 2]]></category>
		<category><![CDATA[bioinformatics]]></category>
		<category><![CDATA[motifs]]></category>
		<category><![CDATA[Repository]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/?p=158</guid>
		<description><![CDATA[Part 13 of the motifs series is the last one. In a couple of weeks I will post the refactored code, including the suggestions from Andrew in the last post. I will update the blog contents on OWW and commit some of the code to the GitHub repository.

]]></description>
			<content:encoded><![CDATA[<p>Part 13 of the motifs series is the last one. In a couple of weeks I will post the refactored code, including the suggestions from Andrew in the last post. I will update the blog contents on OWW and commit some of the code to the GitHub <a href="http://en.wikipedia.org/wiki/Repository" title="Repository" rel="wikipedia" class="zem_slink">repository</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2008/09/03/obtaining-overrepresented-motifs-in-dna-sequences-final/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

