BioPython and CVS
off topic March 13th, 2009
I start this post with an apology. I usually don’t rant or vent here; I reserve that for my personal blog.
I don’t use BioPython and never have. I have it installed on my systems, but I have never written a piece of code that imports BioPython routines. I do subscribe to their mailing lists, both user and developer. I may have written to the list once; otherwise I just follow the discussions there.
Since last year one of the main topics has been the possibility of moving BioPython from CVS to another version control system. Yes, you read that right. It’s 2009 and BioPython uses CVS as their version control system. Soon, CVS will be like typewriters and LPs to young developers. The last stable release of CVS was sometime in 2005, which in interwebs time is equivalent to something like 1972. Since 2005, Subversion has taken the world of version control by storm, and Git is also getting very strong, not to mention Bazaar, Darcs, Mercurial and some others that I might not be aware of.
This is a discussion that has been dragging on for some time on the list. And it’s a shame, a clear lack of leadership from whoever is (not) leading the project. BioRuby is Git, BioPerl SVN and BioPython is CVS, because they “need to care for the legacy developers”. It’s like MSFT keeping two copies of the Notepad executable because they needed to cater to legacy applications, but on a different scale of course. With the current Python steam in the non-bioinformatics and bioinformatics community, it is very sad to see BioPython not evolving (before you ask me, no, I’m not interested in helping, not the way things are now). Perl, which is a language forever in waiting for its holy grail (Perl 6), has a strong community behind it and, more importantly, excellent leadership that’s not scared of making decisions.
So, if you’re still using CVS, it’s 2009!
March 13th, 2009 at 3:07 pm
Paulo,
Thanks for that post. I’m also on both lists, watching the discussion about moving to a DVCS, but apparently it got stuck at a point where nobody wants to take responsibility for moving to Git or Bazaar. I would like to contribute some code, and it’s not that this is impossible with CVS; it’s just not so flexible when I only want to use it once in a while for my own changes and mostly want to see what others are coding/debugging. Github would be perfect. I’m following both repos there; hopefully one of them will gain official status.
March 13th, 2009 at 3:35 pm
I think the worst excuse was to “support legacy developers”.
March 13th, 2009 at 5:00 pm
What blows my mind is PHP and Drupal both still use CVS. They’re some of the most popular technologies on the web!
Luckily I don’t use either project personally, so it doesn’t bother me.
March 13th, 2009 at 7:39 pm
I think it’s sane to wait for a technology/software to become robust and well supported before moving large projects, but there’s always a cutoff point where you have to evolve and change and make the life of developers easier. It’s also sane to avoid the hype, but at least SVN is beyond the hype phase, and Git/Github is getting close to it.
At the same time that you want to support legacy developers, you’re not allowing new developers to join your project. What is easier: to make the legacy developers learn a new and better technology, or to attract new blood?
March 13th, 2009 at 8:34 pm
Are they having any trouble with CVS? Does CVS prevent people from contributing in a way that Subversion would not?
I doubt that the answer to either of these is a strong “yes”… so why should they change!? Change for change’s sake is silly.
I think there are good arguments for moving to a DVCS, for example, but absent a strong desire to move to a DVCS, arguing that CVS is just too *old* strikes me as being beside the point.
(Not that I really disagree – I think there are compelling arguments for moving to svn because it’s less *broken* but that’s not the argument you’re making here!)
–titus
March 13th, 2009 at 10:24 pm
Titus, I have heard of people complaining about the current state and not being able to contribute to the project. I don’t know if the answer to your questions would be a strong yes, but I guess the answers are yes.
Re: CVS being old, that is not the point I tried to make. Old is not necessarily bad, but in some aspects of life (software for instance) old can be bad. The point is that there are better options out there, and BioPython can benefit from them. Have you ever watched the RoR commit movie? Did you see what happened when it was ported to Github?
March 14th, 2009 at 6:46 pm
Titus Brown said:
“Are they having any trouble with CVS? Does CVS prevent people from contributing in a way that Subversion would not?”
Yes. There was trouble managing the conversion over to NumPy because of the CVS track-only-by-file-not-by-changeset restriction. There’s also the massive benefit of merges made simple thanks to the modern distributed systems.
Then there’s the restriction that nobody can commit unless they are a developer. In a DSCM you can’t commit to the trunk unless you have access, but you can still develop off the trunk on your own branch; then the main developers can merge in your changes, and the provenance of your contribution is retained in the merge. Right now, if you’re not a developer, you have to submit a patch via Bugzilla, which a main developer can then apply and commit. Your efforts aren’t noted in the code base.
Also, with sites like GitHub or Launchpad, outsiders can see who’s working on what feature, thanks to the fork/branch tracking. Even the main developers don’t necessarily remember who’s responsible for what.
This is not about “the new hotness”. This is about benefits that outweigh costs.
March 14th, 2009 at 9:06 pm
Chris,
Err, yes, I understand that there are real reasons to change. (As you know I use svn, git, darcs, github, and code.google.com myself.) Most of your response focuses on the utility of DVCS over svn, which is a different discussion from whether or not to move from CVS to svn. But Paulo didn’t discuss any of that in his post — he just complained about how old CVS is and how BioPython isn’t “with it” because they’re using an old VCS.
I’m not defending staying with CVS, merely pointing out that it’s worth making a real argument…
–titus
March 14th, 2009 at 9:48 pm
The opening line of the post was an apology and a suggestion that this is/was a rant. I could spend the whole day writing a long post with the arguments for changing from CVS to whatever. I decided to stay with the rant, and point out how CVS is old: not the “good old”, but the “obsolete old”. My points were that CVS is detrimental to the development of BioPython because it’s obsolete and that there is a complete lack of leadership in the project.
As we say in Portuguese, to state that SVN/Git/Darcs/Mercurial/etc are better than CVS would be to rain on a wet patch. There are many sites and blogs, written by cleverer people, showing all the reasons one should migrate his/her project to one of these VCSs.
March 14th, 2009 at 10:19 pm
First, I fail to see why you are so upset, as you yourself say you do not use Biopython and have no stake in the project.
Second, Biopython has no dedicated professional developers, it is not funded by any agency, and it is run by volunteers who have their day jobs. Therefore the priorities are a bit different, and so is the pace. Some of us go under for a long time due to day jobs (I’ll probably never return to contribute full steam as I did when I was a grad student and a beginning postdoc). Staying on CVS for a couple more years past its last release would not kill Biopython. Other issues such as maintaining compatibility with changing packages (NumPy), removing legacy code that was good for its time but no more (Mindy/Martel) and preparing for 2.6 (and 3.0) were deemed more important. Also, adding necessary features to the sequence object, adding trace data compatibility, and moving to a different testing environment all took time and leadership decisions. Under all these constraints, Biopython is still second only to bioperl in its user-base size (judging by list traffic, active participants, and journal citations).
“Perl, which is a language forever in waiting for its holy grail (Perl 6), has a strong community behind it and, more importantly, excellent leadership that’s not scared of making decisions”
Not sure what point you are making by that. What does the management of Perl (the language) have to do with anything? I don’t think you are seriously trying to compare a project of the magnitude of Perl to Biopython?
“With the current Python steam in the non-bioinformatics and bioinformatics community, it is very sad to see BioPython not evolving”
Well, I would say that your view is rather biased. If you have been following the lists as you claim, you may have noticed that Biopython is evolving and adding many new features. There are many of those, as you can see in the release notes. “Not evolving” is not the same as “not migrating from CVS to git/bzr/SVN”.
Finally, the discussion on migration is actually in full steam: the decision to migrate has been made, and only the mechanism to ensure a smooth transition is still under discussion.
“BioPython is CVS, because they “need to care for the legacy developers””
Can you point where something like that has been said?
March 14th, 2009 at 11:52 pm
Not upset; I’m just sad about the state things are in. A nice programming language should have more support in the bioinformatics community. Can’t you see why some people do not contribute to it?
I follow the list, and a lot of the people I saw contributing in the past have left. I thought of contributing, but decided not to after following the list for some time.
I’m not comparing Perl and BioPython, but BioPerl and BioPython. Perl is a language that is losing prominence and losing ground to Python and Ruby, and still BioPerl is a heck of a good framework, well oiled and working great. It is sad that BioPython cannot gain traction from its parent language. I don’t know for how long BioPython will be second (in your estimates), but maybe not for long.
So I guess it’s more important to have a core group of people doing all the changes instead of having a large group of contributors working together improving minor things, such as updating to 2.6 and 3.0. Do you understand that, the way things are, core contributors work more and on fewer aspects of the project? Or are you that short-sighted?
Again, I point to the video from RoR commits to the repository. The explosion of contributions after it was migrated to Github is the main proof of what should be done with every open source project.
I used the wrong words. The correct ones are “old-school developers” and “smooth transition”. Yep, science is always on the cutting edge …
March 14th, 2009 at 11:53 pm
BTW, nobody pays me to write this site either. If you want to help me pay the ISP, just click on the advert. Thanks.
March 15th, 2009 at 11:48 am
Note that there are already some experiments in migrating the Biopython CVS to GitHub and Bazaar.
For example, I have created a test repository in github:
- http://github.com/biopython/biopython/tree/master
and also tried some experiments on creating experimental branches (see the ‘Network’ tab in the previous link).
Basically, it is not technically too difficult to mirror an official branch of development with two mirrors in github and bazaar.
The problem, I think, is that nobody has enough experience in doing that in practice, so we have to wait for the core Biopython devs to get used to a different RCS.
Anyway, my impression is that Peter and the others are positive about this change; they just want to do it without hurrying.
p.s. I also published a blog post similar to this… (http://bioinfoblog.it/2009/02/biopython-looking-for-a-new-vcs/) I started writing another to describe the GitHub experimental migration, but forgot to publish it.
March 15th, 2009 at 11:57 am
Yes, I’m following it there. Unfortunately it is not the “official” repository, but it is a well-thought-out effort.
March 16th, 2009 at 5:46 am
Regarding comment 11, “old-school developers” was said in the context of evaluating the git-cvsserver tool, which would provide a CVS interface to a git repository. Your usage is out of context; see:
http://lists.open-bio.org/pipermail/biopython-dev/2009-March/005470.html