Let’s improve our previous script and put the contents of the file in a variable similar to an array. Python understands different formats of compound data types, and list is the most versatile. A list in Python can be assigned by a series of elements (or values) separated by a comma and surrounded by square brackets

shoplist = ['milk', 1, 'lettuce', 2, 'coffee', 3]

Now, we are going to read the same file and store the DNA sequence in a list and output this variable. The beginning of the script is the same, where we basically tell Python that the file name is AY162388.seq.

#! /usr/bin/env python
dnafile = "AY162388.seq"

We are going to change the way we read the file. Instead of just opening and then reading line-by-line, we are going to open it a read all the lines at once, by using this

file = open(dnafile, 'r').readlines()

Notice the part in bold? In the previous script, we open and store the contents of the file in a file object. Now , we are opening the file and just after it is opened, we are reading all the lines of the file at once and storing them in file<code>file object, but is a Python’s list of strings.

Before, if we wanted to manipulate our DNA sequence, we would had to read it, and then in the loop store in a variable of our choice. In this script, we do that all at once, and the result is a variable that we can change the way we wanted. The code without the output part is

#! /usr/bin/env python
dnafile = "AY162388.seq"
file = open(dnafile, 'r').readlines()

Try putting a print statement after the last line to print the file list. You will get something like this

['GTGACTT...TTGTTC\n', 'TTTAAATA....TAATC\n', 'AGTGA...CTATG\n', 'GAGCTCA....TATAGC\n', ...]

which is exactly the description of a Python’s list. You see all lines, separated by comma and surrounded by square brackets. Notice that each line has a carriage return (\n&lt;\code&gt;) symbol at the end.

Let’s make the output a little nicer including a loop. Remember when I introduced loop I wrote that Python iterates over “items in a sequence of items”, what is a good synonym for list. So the loop should be as straightforward as

for line in file:
    print line</pre>
Putting everything together gives us
<pre lang="python">#! /usr/bin/env python
dnafile = "AY162388.seq"
file = open(dnafile, 'r').readlines()
print file
for line in file:
    print line

that can be downloaded here. Next we will work on improving the output again and maybe modify/convert the list.