Scripts and Python 3.0, part 2, using 2to3
Python 3.0, December 8th, 2008
And we’re back, checking that our initial scripts run on Python 3.0. Along with this latest release comes a nice tool for parsing your scripts. It’s called 2to3, and it’s available in the Tools/scripts directory of your Python 3.0 installation. Basic usage is very similar to any Python script:
[sourcecode language='bash']
2to3 script.py
[/sourcecode]

The output is similar to diff output, showing the lines that should be changed in their original form and in their Python 3.0 form. Codes 01, 02, 03 and 04 were trivial to change; the main "issue" was the print statement. The final codes 03 and 04 are below, and after them we will look at code_05.

One extra thing: code_04 used to print the sequence line by line (basically outputting what was read from the file). If we change the print statement from

[sourcecode language='python']
print line,
[/sourcecode]

to

[sourcecode language='python']
print(line,)
[/sourcecode]

it will add an extra blank line between the sequence lines, because each line read from the file already ends with a newline and the print function appends another one. The same will happen if we change the line to

[sourcecode language='python']
print(line)
[/sourcecode]

So what we need to do is add a parameter to the print function, end. This tells Python what we want at the end of the line, in our case an empty string. So the line becomes

[sourcecode language='python']
print(line, end = '')
[/sourcecode]

That's it. So code_03 and code_04 will be

[sourcecode language='python']
#code_03
import re
#setting the DNA string
myDNA = 'ACGTTGCAACGTTGCAACGTTGCA'
#assigning a new regex and compiling it
#to find all Ts
regexp = re.compile('T')
#create a new string that will receive
#the regex result with Us replacing Ts
myRNA = regexp.sub('U', myDNA)
print(myRNA)
#end 03
[/sourcecode]

[sourcecode language='python']
#code 04
#assigning a filename to a variable
dnafile = "AY162388.seq"
#opening the file
file = open(dnafile, 'r')
#printing each line of the file
for line in file:
    print(line, end="")
[/sourcecode]

Now, code_05 has 45 lines including comments, so it is a good idea to test 2to3 on it, especially since a long time has passed since we created the script. There might be some other changes that we would otherwise miss. Let's run 2to3 on our original script and check the output:

[sourcecode language='diff']
RefactoringTool: Skipping implicit fixer: buffer
RefactoringTool: Skipping implicit fixer: idioms
RefactoringTool: Skipping implicit fixer: set_literal
RefactoringTool: Skipping implicit fixer: ws_comma
--- code_05.py (original)
+++ code_05.py (refactored)
@@ -30,16 +30,16 @@
 #the loop continues
 while inputfromuser:
     #raw_input received the user input as string
-    inmotif = raw_input('Enter motif to search: ')
+    inmotif = input('Enter motif to search: ')
     #now we check for the size of the input
     if len(inmotif) >= 1:
         #we compile a regex with the input given
         motif = re.compile('%s' % inmotif)
         #looking to see if the entered motif is in the sequence
         if re.search(motif, sequence):
-            print 'Yep, I found it'
+            print('Yep, I found it')
         else:
-            print 'Sorry, try another one'
+            print('Sorry, try another one')
     else:
-        print 'Done, thanks for using motif_search'
+        print('Done, thanks for using motif_search')
         inputfromuser = False
RefactoringTool: Files that need to be modified:
RefactoringTool: code_05.py
[/sourcecode]

That is a lot of lines, so it seems that we need to make a lot of changes. But in fact there are not many. All lines starting with a - need to be removed and replaced by the lines starting with a +, and each + line sits next to the line it replaces. Again, the most common changes here are the print statements, but there is also another change to be made, from

[sourcecode language='python']
inmotif = raw_input('Enter motif to search: ')
[/sourcecode]

to

[sourcecode language='python']
inmotif = input('Enter motif to search: ')
[/sourcecode]

So there is no raw_input in Python 3.0: it was dropped in favor of a reworked input function that now always returns a string, just like the old raw_input did. That string can be evaluated later if needed (and desired). Now there is no more confusion about which one to use.

Digging a little bit into 2to3, we see that it can write the code for us if we pass the -w parameter when running it. Be careful, as it rewrites the same file, although it does save a backup copy. In the end we get this code from 2to3 (code_05.py):

[sourcecode language='python']
#!/usr/bin/env python
'''
simple script to find motifs on DNA sequences using regex
the script is interactive
'''
# we use the RegEx module
import re
import string
#still keep the file fixed
dnafile = "AY162388.seq"
#opening the file, reading the sequence and storing in a list
seqlist = open(dnafile, 'r').readlines()
#let's join the the lines in a temporary string
temp = ''.join(seqlist)
#assigning our sequence, with no carriage returns to our
#final variable/object
sequence = temp.replace('\n', '')
#we start to deal with user input
#first we use a boolean variable to check for valid input
inputfromuser = True
#while loop: while there is an motif larger than 0
#the loop continues
while inputfromuser:
    #raw_input received the user input as string
    inmotif = input('Enter motif to search: ')
    #now we check for the size of the input
    if len(inmotif) >= 1:
        #we compile a regex with the input given
        motif = re.compile('%s' % inmotif)
        #looking to see if the entered motif is in the sequence
        if re.search(motif, sequence):
            print('Yep, I found it')
        else:
            print('Sorry, try another one')
    else:
        print('Done, thanks for using motif_search')
        inputfromuser = False
[/sourcecode]

2to3 seems to be pretty good at detecting changes, pointing them out to you and even writing the newer script for you. So far I haven't tested it on big scripts (more than 100 lines long), but I plan to do so soon. For small scripts we saw that it works quite well. A good test for 2to3 will come when we get to the motifs scripts, which are a little more complex even though they are quite short. Stay tuned and check for new code on the repository.
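As a quick check of the print behaviour described for code_04, here is a minimal sketch. It assumes nothing about the real AY162388.seq file: the two-line `fake_file_text` string and the `dump` helper are made up for illustration, and the output is captured in memory instead of going straight to the terminal.

```python
import io
from contextlib import redirect_stdout

# Stand-in for the sequence file: two lines, each ending in a newline.
# (Hypothetical data, not the contents of AY162388.seq.)
fake_file_text = "ACGT\nTTGA\n"


def dump(lines, **print_kwargs):
    """Print every line with the given print() options and return the captured text."""
    captured = io.StringIO()
    with redirect_stdout(captured):
        for line in lines:
            print(line, **print_kwargs)
    return captured.getvalue()


# Plain print(line): the line's own newline plus the one print() appends.
doubled = dump(io.StringIO(fake_file_text))

# print(line, end=""): only the newline already present in the line survives.
exact = dump(io.StringIO(fake_file_text), end="")

print(repr(doubled))
print(repr(exact))
```

With the default `end`, every line is followed by a blank one; with `end=""` the captured text is identical to what was read.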
December 9th, 2008 at 4:29 pm
Do you know about PEP 8? It’s a style guide for Python programs, which most Python programmers adhere to for consistency. For example, variable names should be like_this and not likeThis (my_dna or just dna instead of myDNA). Here’s the whole document; I recommend reading and following it if you want to fit in better in the Python world:
http://www.python.org/dev/peps/pep-0008/
December 9th, 2008 at 5:27 pm
Thanks a lot for the suggestion! BTW, these scripts were posted almost two years ago and were just a comparison to some Perl scripts from Beg. Perl for Bioinfo. I guess maybe 3 or 4 scripts have variables with this notation, but you can always help me out and improve my code by checking the scripts on the repository.
December 9th, 2008 at 10:17 pm
I’m glad 2to3 is working for you! If you do find any bugs (and I’m sure there are many), please report them to http://bugs.python.org.
December 10th, 2008 at 11:53 am
Will do. I’m planning to check most of my scripts and wxPython stuff later this week and post here.