<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Beginning Python for Bioinformatics &#187; Python 3.0</title>
	<atom:link href="http://python.genedrift.org/category/python-30/feed/" rel="self" type="application/rss+xml" />
	<link>http://python.genedrift.org</link>
	<description>a step-by-step guide to create Python applications in bioinformatics</description>
	<lastBuildDate>Thu, 20 May 2010 21:34:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1-alpha</generator>
		<item>
		<title>Scripts and Python 3.0, part 2, using 2to3</title>
		<link>http://python.genedrift.org/2008/12/08/scripts-and-python-30-part-2-using-2to3/</link>
		<comments>http://python.genedrift.org/2008/12/08/scripts-and-python-30-part-2-using-2to3/#comments</comments>
		<pubDate>Mon, 08 Dec 2008 19:55:56 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Python 3.0]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/2008/12/08/scripts-and-python-30-part-2-using-2to3/</guid>
		<description><![CDATA[And we&#8217;re back to check our initial scripts to run on Python 3.0. Along with this latest release, a nice tool to parse your scripts is also installed. It&#8217;s called 2to3 and it&#8217;s available in the Tools/scripts directory of your Python 3.0 installation. Basic usage is very similar to any Python script: [sourcecode language='bash']2to3 = [...]]]></description>
			<content:encoded><![CDATA[<p>And we&#8217;re back to check whether our initial scripts run on Python 3.0. Along with this latest release, a nice tool to parse and convert your scripts is also installed. It&#8217;s called <a href="http://docs.python.org/dev/3.0/library/2to3.html" target="_blank">2to3</a> and it&#8217;s available in the <code>Tools/scripts</code> directory of your Python 3.0 installation. Basic usage is very similar to running any Python script:</p>
<p>[sourcecode language='bash']2to3 <name of the script>[/sourcecode]</p>
<p>and the output is similar to a diff, showing the lines that should be changed in their original form and in their Python 3.0 form. Codes 01, 02, 03 and 04 were trivial to change; the main "issue" was the print statement. The final versions of codes 03 and 04 are below, and after them we will look at code_05. One extra thing: in code_04 we printed the sequence line by line (basically outputting what was read from the file), and if we change the print statement from</p>
<pre name="code" class="python">
print line,
</pre>
<p>to</p>
<pre name="code" class="python">
print(line,)
</pre>
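<p>it will add an extra blank line between the sequence lines, because each line read from the file already ends with a newline and <code>print</code> adds one more. The same will happen if we change the line to</p>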
<pre name="code" class="python">
print(line)
</pre>
<p>So what we need to do is add a parameter to the print function, <code>end</code>. This tells Python what we want at the end of the printed line, in our case an empty string. So the line would be</p>
<pre name="code" class="python">
print(line, end = &#039;&#039;)
</pre>
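<p>A minimal sketch of the difference, assuming two made-up sequence lines that still carry their trailing newline, as lines read from a file do. The helper name <code>show</code> is hypothetical and only collects the printed text so we can inspect it:</p>

```python
import io

# two made-up sequence lines, each still ending in a newline
lines = ["ACGTTGCA\n", "TTGCAACG\n"]

def show(lines, end):
    """Print every line with the given end and return the captured text."""
    buffer = io.StringIO()
    for line in lines:
        # file= redirects the output to the buffer instead of the screen
        print(line, end=end, file=buffer)
    return buffer.getvalue()

default_output = show(lines, "\n")  # same as a plain print(line)
clean_output = show(lines, "")      # same as print(line, end='')

# repr makes the newline characters visible
print(repr(default_output))
print(repr(clean_output))
```

<p>With the default <code>end</code> every newline is doubled, while the empty string leaves the lines exactly as they were read.</p>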
<p>That's it. So code_03 and code_04 will be</p>
<pre name="code" class="python">
#code_03
import re

#setting the DNA string
myDNA = &#039;ACGTTGCAACGTTGCAACGTTGCA&#039;

#assigning a new regex and compiling it
#to find all Ts
regexp = re.compile(&#039;T&#039;)

#create a new string that will receive
#the regex result with Us replacing Ts
myRNA = regexp.sub(&#039;U&#039;, myDNA)

print(myRNA)
#end 03

#code 04
#assigning a filename to a variable
dnafile = &quot;AY162388.seq&quot;

#opening the file
file = open(dnafile, &#039;r&#039;)

#printing each line of the file
for line in file:
    print(line, end=&quot;&quot;)
</pre>
<p>Now, code_05 has 45 lines including comments, so it is a good idea to test 2to3 on it, especially since a long time has passed since we created the script. There might be other changes that we would otherwise miss. Let's run 2to3 on our original script and check the output:</p>
<p><code>RefactoringTool: Skipping implicit fixer: buffer<br />
RefactoringTool: Skipping implicit fixer: idioms<br />
RefactoringTool: Skipping implicit fixer: set_literal<br />
RefactoringTool: Skipping implicit fixer: ws_comma<br />
--- code_05.py (original)<br />
+++ code_05.py (refactored)<br />
@@ -30,16 +30,16 @@<br />
 #the loop continues<br />
 while inputfromuser:<br />
     #raw_input received the user input as string<br />
-    inmotif = raw_input('Enter motif to search: ')<br />
+    inmotif = input('Enter motif to search: ')<br />
     #now we check for the size of the input<br />
     if len(inmotif) >= 1:<br />
         #we compile a regex with the input given<br />
         motif = re.compile('%s' % inmotif)<br />
         #looking to see if the entered motif is in the sequence<br />
         if re.search(motif, sequence):<br />
-            print 'Yep, I found it'<br />
+            print('Yep, I found it')<br />
         else:<br />
-            print 'Sorry, try another one'<br />
+            print('Sorry, try another one')<br />
     else:<br />
-        print 'Done, thanks for using motif_search'<br />
+        print('Done, thanks for using motif_search')<br />
         inputfromuser = False<br />
RefactoringTool: Files that need to be modified:<br />
RefactoringTool: code_05.py</code></p>
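<p>The 2to3 report follows the unified diff format. As a rough sketch of how such a report is built (this is not how 2to3 itself works internally), the standard <code>difflib</code> module can produce the same kind of output. The two short lists below are hypothetical excerpts of a script before and after conversion:</p>

```python
import difflib

# hypothetical two-line excerpts of a script before and after conversion
original = [
    "inmotif = raw_input('Enter motif to search: ')",
    "print 'Yep, I found it'",
]
refactored = [
    "inmotif = input('Enter motif to search: ')",
    "print('Yep, I found it')",
]

# unified_diff yields two header lines, a hunk marker, and then
# '-' for lines to remove and '+' for the lines that replace them
report = list(difflib.unified_diff(
    original, refactored,
    fromfile="code_05.py (original)",
    tofile="code_05.py (refactored)",
    lineterm="",
))

for row in report:
    print(row)
```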
<p>The 2to3 report has a lot of lines, so it seems that we need to make a lot of changes, but in fact there are not many. All lines starting with a - need to be removed and replaced by the lines starting with a +; each + line sits right next to the line it replaces. Again, the most common changes here are the print statements, but there is also another change to be made</p>
<pre name="code" class="python">
#Python 2.x
inmotif = raw_input(&#039;Enter motif to search: &#039;)
#Python 3.0
inmotif = input(&#039;Enter motif to search: &#039;)
</pre>
<p>So, there is no <code>raw_input</code> in Python 3.0. It was replaced by a more evolved <code>input</code> function that now always returns what the user typed as a string, which can be evaluated later if needed (and desired), just like the old <code>raw_input</code>. Now there is no confusion anymore on which one to use.</p>
<p>Digging a little bit into 2to3 we see that it can write the converted code for us if we pass the <code>-w</code> parameter when running it. Be careful, as it overwrites the original file, although it keeps a backup copy. In the end we get this code from 2to3 (code_05.py)</p>
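<p>A small sketch of what "always a string" means in practice. The function name <code>read_motif_length</code>, its prompt and the <code>reader</code> parameter are assumptions made for this example; <code>reader</code> only stands in for <code>input</code> so the code can run without a keyboard:</p>

```python
def read_motif_length(reader=input):
    """Ask for a number and convert the string answer explicitly."""
    # reader defaults to input; a replacement can be passed for testing
    answer = reader('Enter minimum motif length: ')
    # in Python 3 the answer is always a string, so we convert it ourselves
    return answer, int(answer)

# simulate a user typing 6 instead of waiting for the keyboard
text, number = read_motif_length(reader=lambda prompt: '6')
print(type(text).__name__, type(number).__name__)
```

<p>If a number is needed, the conversion with <code>int</code> (or <code>float</code>) has to be written out, which is safer than letting Python evaluate whatever the user typed.</p>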
<pre name="code" class="python">
#!/usr/bin/env python

&#039;&#039;&#039;
simple script to find motifs on DNA sequences using regex
the script is interactive
&#039;&#039;&#039;

# we use the RegEx module
import re
import string

#still keep the file fixed
dnafile = &quot;AY162388.seq&quot;

#opening the file, reading the sequence and storing in a list
seqlist = open(dnafile, &#039;r&#039;).readlines()

#let&#039;s join the lines in a temporary string
temp = &#039;&#039;.join(seqlist)

#assigning our sequence, with no carriage returns to our
#final variable/object
sequence = temp.replace(&#039;\n&#039;, &#039;&#039;)

#we start to deal with user input
#first we use a boolean variable to check for valid input
inputfromuser = True

#while loop: while there is a motif larger than 0
#the loop continues
while inputfromuser:
    #raw_input received the user input as string
    inmotif = input(&#039;Enter motif to search: &#039;)
    #now we check for the size of the input
    if len(inmotif) &gt;= 1:
        #we compile a regex with the input given
        motif = re.compile(&#039;%s&#039; % inmotif)
        #looking to see if the entered motif is in the sequence
        if re.search(motif, sequence):
            print(&#039;Yep, I found it&#039;)
        else:
            print(&#039;Sorry, try another one&#039;)
    else:
        print(&#039;Done, thanks for using motif_search&#039;)
        inputfromuser = False
</pre>
<p>2to3 seems to be pretty good at detecting the required changes, pointing them out to you and even writing the newer script for you. So far I haven't tested it on big scripts (more than 100 lines long), but I plan to do it soon. For small scripts we saw that it works quite well. A good test for 2to3 will come when we get to the motifs scripts, which are a little more complex even though they are quite short. Stay tuned and check the new code on the repository.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2008/12/08/scripts-and-python-30-part-2-using-2to3/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Scripts and Python 3.0, part 1</title>
		<link>http://python.genedrift.org/2008/12/06/scripts-and-python-30-part-1/</link>
		<comments>http://python.genedrift.org/2008/12/06/scripts-and-python-30-part-1/#comments</comments>
		<pubDate>Sat, 06 Dec 2008 16:07:08 +0000</pubDate>
		<dc:creator>Paulo Nuin</dc:creator>
				<category><![CDATA[Python 3.0]]></category>
		<category><![CDATA[print]]></category>
		<category><![CDATA[python2.x]]></category>
		<category><![CDATA[python3.0]]></category>

		<guid isPermaLink="false">http://python.genedrift.org/2008/12/06/scripts-and-python-30-part-1/</guid>
		<description><![CDATA[Yes, Python 3.0 was released earlier than Perl &#8230; what version was it? 6? 7? Anyway, I decided to go back to most of the scripts that were posted here. In the github repo we have 50 files in the &#8220;original scripts&#8221; directory. Let&#8217;s check how they fare on Python 3.0 and what type [...]]]></description>
			<content:encoded><![CDATA[<p>Yes, Python 3.0 was released earlier than Perl &#8230; what version was it? 6? 7? Anyway, I decided to go back to most of the scripts that were posted here. In the github repo we have 50 files in the &#8220;original scripts&#8221; directory. Let&#8217;s check how they fare on Python 3.0 and what kind of changes we need to make for them to work. Starting with code_01.py, which is a couple of lines long</p>
<pre name="code" class="python">
myDNA = &quot;ACGTACGTACGTACGTACGTACGT&quot;
print myDNA
</pre>
<p>Here we have one of the most evident differences between Python 2.x and 3.0. Now <code>print</code> is a <a href="http://docs.python.org/dev/3.0/whatsnew/3.0.html#print-is-a-function" target="_blank">function, not a statement anymore</a>, so whatever we want to print should be passed as a function argument. The above code would be changed to</p>
<pre name="code" class="python">
myDNA = &quot;ACGTACGTACGTACGTACGTACGT&quot;
print(myDNA)
</pre>
<p>That simple in this case. But what are the advantages of <code>print</code> being a function over a statement? More flexibility, as can be seen in the link above. It is now possible to pass different parameters to print and make the output richer by customizing the separator between items, redirecting the output, etc.</p>
<p>A similar change would have to be made in our code_02.py. There are four <code>print</code> statements there that should be translated to the function. Trivial, so far. The original code</p>
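<p>A quick sketch of that flexibility, using the <code>sep</code>, <code>end</code> and <code>file</code> parameters of the Python 3 <code>print</code> function. The short sequences and the <code>buffer</code> and <code>joined</code> names are made up for this example:</p>

```python
import io
import sys

myDNA = "ACGTACGT"
myDNA2 = "TCGATCGA"

# sep replaces the default single space between items
print(myDNA, myDNA2, sep="-")

# file sends the output somewhere else, here the standard error stream
print("Concatenated sequence", file=sys.stderr)

# any object with a write method works, such as an in-memory buffer
buffer = io.StringIO()
print(myDNA, myDNA2, sep="", end="", file=buffer)
joined = buffer.getvalue()
```

<p>None of this was possible with the old statement without special syntax such as the trailing comma or <code>print &gt;&gt;</code>.</p>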
<pre name="code" class="python">
myDNA = &quot;ACGTACGTACGTACGTACGTACGT&quot;
myDNA2 = &quot;TCGATCGATCGATCGATCGA&quot;
print &quot;First and Second sequences&quot;
print myDNA, myDNA2
myDNA3 = myDNA + myDNA2
print &quot;Concatenated sequence&quot;
print myDNA3
</pre>
<p>and to work on Python 3</p>
<pre name="code" class="python">
myDNA = &quot;ACGTACGTACGTACGTACGTACGT&quot;
myDNA2 = &quot;TCGATCGATCGATCGATCGA&quot;
print(&quot;First and Second sequences&quot;)
print(myDNA, myDNA2)
myDNA3 = myDNA + myDNA2
print(&quot;Concatenated sequence&quot;)
print(myDNA3)
</pre>
<p>This would be the biggest (or at least the most common) change that we would need to make in the scripts posted here. Follow the repo to get the newer versions.</p>
]]></content:encoded>
			<wfw:commentRss>http://python.genedrift.org/2008/12/06/scripts-and-python-30-part-1/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>

