Python for Bioinformatics by Sebastian Bassi: a (short) review

I promised some time ago to post a complete review of Python for Bioinformatics (Chapman & Hall/CRC Mathematical & Computational Biology) by Sebastian Bassi. It’s long overdue, but the delay allowed me to get better acquainted with the book and its contents.

I can only say that I highly recommend this book, especially for the biologist who is beginning in bioinformatics or Python (or both). I cannot compare it to any other Python and bioinformatics books (I’m planning to buy another one), but I can say that I could learn a thing or two from Sebastian’s book. It is evidently not a perfect book, as some of the explanations are a little bit rushed and might be difficult for a beginner. At the same time, this is a very carefully thought-out and planned book, and it has more than enough for one to learn Python and apply the language to solve biological problems. I really liked the BioPython section, which made me use BioPython for the first time. Some of the BioPython examples in the book are light years ahead of the examples on the tool’s website.

Lastly, I would like to congratulate Sebastian on his work and effort in putting together a nice tome on Python and bioinformatics. It’s a valuable resource for everyone in the field and will certainly help spread Python in the community.

Biostar: bioinformatics community

Biostar is a bioinformatics community on the StackExchange network. It’s still small and not a lot of questions are asked and answered every day, so we need more people participating. If you are new to bioinformatics, or are just curious about the newest trends in the field, help us grow.

The real value of blogging

A couple of days ago I posted here an entry called ‘The “sickest” Python code I’ve ever created’. It’s a piece of code that does some file management for proteomics data, with a different set of inputs each time you run it.

The “sickest” part of the title is there because the code was a small challenge to me. I’ve been away from actual hard-core coding for quite some time, and you lose some of the gist of the thing with time. Nowadays I mostly write simple scripts that don’t require any kind of advanced skills (in any language), and I don’t worry that much about releasing code or about ultra-fast performance. I knew from the time I posted that a lot of people would jump in to help and teach me, as I was aware it wasn’t the most elegant code out there, nor the most Pythonic. What also helped me is that my Python/Bioinformatics blog is indexed on Planet Python, and the audience is far more hard-core Python than I ever dreamed of getting by myself.

But the real deal is that I believe it would have been much more difficult for me to get some positive feedback, or even an answer, if I had posted bits of my code on an online forum, community or mailing list. Every time I used one of these methods, I either got no answer, got schooled for not posting in the right format, or was told that no one knew how to do it. That’s the real value of blogging, and the value is even higher if your audience knows more than you do. I appreciate every comment I got on that post and on others too. I learned things that I wasn’t able to learn from computer books and online tutorials (yes, I searched, sometimes before reading the comments and sometimes after).

Python Testing Beginner’s Guide, review

I posted about a week ago that Packt Publishing had invited me to review Python Testing Beginner’s Guide by Daniel Arbuckle. Having finished reading the book (I must admit that I haven’t tried all the code in it), I can say that I have an excellent initial impression of the book.

PTBG is not a long book, and the topic is divided into 10 chapters and one appendix. One of the first things that I liked about the book is that there’s no introduction (or something similar) to Python. It just goes straight to the point, assuming that you have a good understanding of the language and everything that surrounds it. In the past I was frustrated with some “Introduction to X with Python” books that wasted precious space talking over and over about a topic, learning Python, that is better covered in many other books. PTBG wastes no time or space before getting to its main topic, which is testing, and in my opinion that’s the best approach, even though it might look a little bit abrupt to some.

The language and text in the book are clear and very pleasant. PTBG is a very well written book and I really enjoyed its style. The first chapters of the book cover Python testing using doctests. For someone like me who hasn’t written many tests in the normal software development workflow (I know I should write more tests), this section seems like a really nice introduction to the topic, with well thought-out, real-life-like examples and a good flow in the explanation of the different features. One small complaint that I have is that, for a beginner, the code listed in the examples might sometimes seem a little bit confusing, and maybe the addition of line numbers would have helped a bit here. But at the same time I understand that this is the normal style of some Packt books.
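For readers who have never seen a doctest, here is a minimal sketch of the idea. It is my own illustration, not an example taken from the book: the gc_content function is hypothetical, and the code assumes Python 3, where / always returns a float.

```python
import doctest


def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence.

    The examples below are the tests: doctest runs each line that starts
    with >>> and compares the result with the line that follows it.

    >>> gc_content("GGCC")
    1.0
    >>> gc_content("atgc")
    0.5
    >>> gc_content("")
    0.0
    """
    if not sequence:
        return 0.0
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


if __name__ == "__main__":
    # testmod() collects and runs every docstring example in this module
    print(doctest.testmod())
```

The appeal of this style is that the documentation and the tests are the same text, so the examples in a docstring cannot silently go stale.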

After the doctests section, PTBG gets into more advanced techniques, briefly covering mock objects with Mocker, then moving on to unittest and nose. The latter is a Python tool for managing, running and automating tests. Also covered is Twill, another third-party library, which allows for testing of web applications.

One full chapter is devoted to test-driven development, with a complete walkthrough of this approach. This gives a wrap-up of most of the techniques and modules covered in the book, but there’s still space for another chapter that shows how beautifully doctests, unittest and nose can be fully integrated and help the development of applications using the test-driven approach.

Overall, I really enjoyed PTBG. As I mentioned, test-driven development was never a high priority in the applications I usually developed with Python. But this book can certainly be a good starting point for Python testing beginners to incorporate these techniques into their usual development workflow. Scientific software is also a perfect niche for this type of approach, and we should do what is possible to avoid the nightmares of the past.

Bioinformatics career survey

Via Bioinformatics Zen:

Repository

As mentioned in the last post, I am moving the current repository, an HTML page, to an actual Git repository on github.com. The link to the repository is

http://github.com/nuin/beginning-python-for-bioinformatics/tree/master

and it can be accessed by anyone. There are only a handful of scripts there, but I am slowly adding more comments to the scripts and moving them to GitHub. The web interface at GitHub is pretty nice, and the code can be viewed on the website with nice syntax highlighting, for example. There is also an RSS feed to receive updates, commits, etc.

How to create a local copy

Git is very easy to use, and it is very simple to create a copy of the repository on your local machine. Git is available on most systems as a command-line utility (there is a GUI, but I haven’t used it yet), and to have an updated copy of the Beginning Python for Bioinformatics repository, two commands are needed.

- first you have to clone the repository

$ git clone git://github.com/nuin/beginning-python-for-bioinformatics.git

that will create a beginning-python-for-bioinformatics directory wherever you run git (this only needs to be done once)

and to keep the clone updated

$ git pull

from inside the clone directory.

Any questions please let me know.

Python and AppEngine in Vista

I had problems installing Google’s AppEngine on Vista. I had Python 2.5.1 installed on my machine, but every time I tried to install the msi package it failed, claiming that Python was not present, even though C:\Python25 was in the path. The AppEngine issues site did not help much either: the “solution” listed there was to make sure Python was in the path.

So, I decided to start over. I removed Python (and ActiveState Python, which I had installed before to see if AppEngine would work) and re-installed it, or tried to. Strangely, Python’s msi package was installing it in the root of the C drive, not under Python25. For half an hour I tested all possible combinations, versions and tricks to have it installed in the proper directory/folder. Then I remembered msiexec, a command-line tool that runs msi packages. Running msiexec with no parameters shows a window with options, but basically /i is the only one required. /i tells msiexec which package to install, while /qb makes the installation quiet, with a basic interface. msiexec worked flawlessly and then Python was in its right place.

msiexec /i python-2.5.2.msi TARGETDIR="C:\Python25" /qb

Then the same "trick" was used with AppEngine, but without a TARGETDIR. Bingo, it installed perfectly (I am assuming that).

Now, I just have to wait for my AppEngine account.

Test II

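import sys
# Print every line of the file named on the command line (assumed input)
for line in open(sys.argv[1]):
    print(line.rstrip("\n"))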

Benchmarking Python: fastest way to generate unique lists

Phase 2, Uncategorized 2 Comments »

Just for fun, let’s see if there is any advantage (apart from producing shorter code) in using either of the approaches to create a unique list. A list of 741 gene IDs and another one with 1322 (that contained all the 741 IDs from the first) were used. Instead of hard-coding the lists in the script, normal I/O was used and the files were read automatically. Using a for loop, the scripts were run 10 times and the times averaged.

Here are the results

Using dictionaries 0.08540
Using sets 0.15890

Almost twice as fast for the dictionaries. Why? One thing, and one thing only: Python version. My system’s default Python is 2.3.4, which has one of the first implementations of set. I have a “personal” Python, version 2.5, so the test was redone with the newer Python

Using dictionaries 0.04520
Using sets 0.03770

OK. That shows a little bit of an advantage for sets, which is expected. But it also shows us that different versions of Python can differ hugely. This is a rather crude and simple test, with a small list of entries, but there is a huge gain from version 2.3 to version 2.5, in both dicts and sets.
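If you want to repeat the comparison on a current interpreter, something along these lines should do. This is only a sketch and not the original scripts: it assumes Python 3, the gene IDs are made-up stand-ins generated in memory instead of being read from files, and the function names are mine. The timings will of course differ from the ones above.

```python
import timeit


def unique_with_dict(items):
    """Deduplicate by using the items as dictionary keys.

    In Python 3.7+ dictionaries keep insertion order, so the result
    keeps the first occurrence of each item in its original position.
    """
    return list(dict.fromkeys(items))


def unique_with_set(items):
    """Deduplicate by passing the items through a set (order is not kept)."""
    return list(set(items))


# Hypothetical stand-in data: 1322 IDs, of which only 741 are distinct
gene_ids = ["gene%04d" % (i % 741) for i in range(1322)]

for func in (unique_with_dict, unique_with_set):
    # Run each approach 10 times and report the average, as in the post
    seconds = timeit.timeit(lambda: func(gene_ids), number=10) / 10
    print("%-18s %.6f s per run" % (func.__name__, seconds))
```

Both functions return the same IDs; only the dictionary version guarantees the original order, which may matter more than the few microseconds of difference.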

For a more comprehensive test and more functions with a similar objective check this page.

Python Magazine

Uncategorized Comments Off

This is a blog devoted to basic Python, but I could not refrain from suggesting the new Python Magazine, which has its first issue free online.

Enjoy.
