Restriction enzymes: second take
Section 6 September 7th, 2007
We already have a function that reads the enzymes from a dataset in a flat file (with one change: it now ends with a return statement).
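def read_enzymes(file):
    #dictionary mapping enzyme name to recognition site
    resenz = {}
    start = False
    for line in file:
        #the enzyme list begins after the header line naming Rich Roberts
        if line.find('Rich Roberts') >= 0:
            start = True
            line = next(file)
        if start and len(line) > 10:
            buffer = line.split()
            #first field is the name, last field is the site; drop the cut marker
            resenz[buffer[0]] = buffer[-1].replace('^', '')
    return resenz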
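To see what the function returns, here is a self-contained sketch that runs it on a small in-memory sample. The sample text is invented for illustration and only mimics the layout the function expects (a header line containing 'Rich Roberts', then lines with the enzyme name first and the recognition site last); it is not the real bionet file. The function is restated under the hypothetical name read_enzymes_sketch so the block runs on its own.

```python
import io


def read_enzymes_sketch(handle):
    """Map enzyme name -> recognition site, same logic as read_enzymes."""
    resenz = {}
    start = False
    for line in handle:
        if 'Rich Roberts' in line:
            start = True
            # skip the header line itself; '' if the file ends here
            line = next(handle, '')
        if start and len(line) > 10:
            fields = line.split()
            resenz[fields[0]] = fields[-1].replace('^', '')
    return resenz


# Invented sample, NOT the real dataset: only the layout matters here.
SAMPLE = (
    "Some header text before the enzyme list\n"
    "Rich Roberts, illustrative header line\n"
    "\n"
    "EcoRI                             G^AATTC\n"
    "BamHI                             G^GATCC\n"
)

enzymes = read_enzymes_sketch(io.StringIO(SAMPLE))
print(enzymes)
```

The cut marker '^' is removed, so each value is the plain recognition sequence that we will later search for.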
We now need to write a function that searches for the sites, and a main function that accepts the parameters, coordinates the search and returns the results. Looks like we are more than halfway there.
Parameter input was shown before, starting in section 3. We import the sys module and use the list sys.argv to pass the parameters to the script. A basic skeleton of our main function would look like this:
import sys
import re
import fasta
#reading the enzyme dataset in one line and storing
#enzyme information in enzymeset
enzymeset = read_enzymes(open('bionet.709', 'r'))
#storing the enzyme name in a string
enzyme = sys.argv[1]
#reading a FASTA file and storing the sequences
sequence = fasta.get_seqs(open(sys.argv[2], 'r').readlines())
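The skeleton reads sys.argv[1] and sys.argv[2] directly, which raises an IndexError if the user forgets a parameter. One possible guard is sketched below; the helper name parse_args and the script name restriction.py are made up for this example and are not part of the script above.

```python
def parse_args(argv):
    """Return (enzyme, fasta_path) from an argv-style list, or None if incomplete."""
    # argv[0] is the script name, so two parameters means at least three items
    if len(argv) < 3:
        return None
    return argv[1], argv[2]


# In the real script we would pass sys.argv; a literal list is used here
# so the example can be run without command-line parameters.
args = parse_args(['restriction.py', 'EcoRI', 'sequences.fasta'])
if args is None:
    print('usage: restriction.py ENZYME FASTA_FILE')
else:
    enzyme, fasta_path = args
    print(enzyme, fasta_path)
```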
That’s a start. Now we have to write a function that checks whether the enzyme name entered by the user actually exists in the dataset. Something like this would work:
def check_enzyme(name, enzymes):
    #True if the name is a key of the enzyme dictionary, False otherwise
    return name in enzymes
This basically tests whether the dictionary contains the name entered. If it does, we return True; otherwise False is returned. This changes our main script body:
import sys
import re
import fasta
#reading the enzyme dataset in one line and storing
#enzyme information in enzymeset
enzymeset = read_enzymes(open('bionet.709', 'r'))
#storing the enzyme name in a string
enzyme = sys.argv[1]
#check if the name entered is valid
isname = check_enzyme(enzyme, enzymeset)
#if it is valid, continue, otherwise abort
if isname:
    #reading a FASTA file and storing the sequences
    sequence = fasta.get_seqs(open(sys.argv[2], 'r').readlines())
else:
    print('Name invalid. Please try again.')
So, we have a good idea of what to do now. We just need a function that creates a regular expression and searches for it in the sequence. Next time …
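As a rough preview of that step, here is one way such a function could look. This is only a sketch under a few assumptions: the names IUPAC, site_to_regex and find_sites are invented for this example, recognition sites are taken to use the standard IUPAC nucleotide codes, and only the strand as given is searched.

```python
import re

# Standard IUPAC nucleotide codes written as regular expression fragments
IUPAC = {
    'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T',
    'R': '[AG]', 'Y': '[CT]', 'S': '[CG]', 'W': '[AT]',
    'K': '[GT]', 'M': '[AC]', 'B': '[CGT]', 'D': '[AGT]',
    'H': '[ACT]', 'V': '[ACG]', 'N': '[ACGT]',
}


def site_to_regex(site):
    """Compile a recognition site such as 'GTYRAC' into a regular expression."""
    return re.compile(''.join(IUPAC[base] for base in site.upper()))


def find_sites(site, sequence):
    """Return the 0-based start positions of non-overlapping matches."""
    pattern = site_to_regex(site)
    return [match.start() for match in pattern.finditer(sequence.upper())]


print(find_sites('GAATTC', 'AAGAATTCTTGAATTCGG'))
print(find_sites('GTYRAC', 'ccGTCGACcc'))
```

Note that finditer reports non-overlapping matches only, so overlapping occurrences of a site would need a different pattern.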