Andreas Jacobsen's Distraction

Another cause of procrastination

Subversion sucks, get over it

The de facto standard for open source version control systems has been Subversion for the last several years. While CVS is still in use in some places, Subversion is miles ahead. While Subversion has served many people well, it has some failings that make it inappropriate for several classes of project. The most important of these is open source projects. This post is going to look at why Subversion sucks for open source projects. I’ll look at how these arguments also apply to internal business source code management in a future post.

The primary problem with Subversion is the centralized repository. This manifests itself in several ways. Firstly, you must have operations-level access to create a new project repository. Secondly, you must have commit access to touch the history of a project. Thirdly, developers are dependent on the project infrastructure to contribute. There are probably more, but today I’ll talk about these.

Creating new Subversion project repositories

Creating a new Subversion repository requires access to the svnadmin command on the box running a project’s Subversion repositories. This means access (possibly indirect) to a shell account, which sets the bar quite high for creating new repositories. This might not seem like a big deal. There’s even an ugly hack pattern to work around it: instead of creating new repositories, organizations put everything in the same Subversion repository. An example of this anti-pattern can be seen in the ASF Subversion repository. This is plain bad design. Navigating through these massive repositories is a pain, managing commit access becomes a much broader security issue, and the trunk/tags/branches structure is broken.

Touching project history

Touching project history might seem like a holy right that should be reserved for vetted people, but this is wrong. Users, not project leads, are the final deciders of code value. Political differences in a project should not impact what code is finally distributed. Maintaining patches out of tree violates the fundamental premise of source code management systems: that source code management should be automated, not done by hand. Source code management systems that encourage out-of-tree maintainers to abandon source code management are therefore very problematic.

The other assumption, that an official project contributor is always more qualified than a non-contributor, has been shown to be false several times. In fact, rejecting that assumption is a central premise of the free software movement, the open source community’s Right To Fork and the basis of any free market paradigm. Relying on a source code management system that has a centrally controlled access list therefore runs fundamentally counter to ideals that contribute to software quality. This doesn’t imply that Subversion leads to worse software, or that it can’t be reconciled with these ideals through clever workarounds, but the dissonance is there and needs to be addressed.

Dependence on infrastructure

The third disadvantage of a central repository is that the lack of local history means one relies on infrastructure availability for source code management. There are primarily two situations where this matters: when the infrastructure fails and when it is unavailable. Infrastructure failure can happen if a server goes down, if a local internet connection fails, or through a host of other events that affect access to the central repository. Being able to continue to perform source code management under these conditions is important, because infrastructure failure will happen. For open source projects this is important because time is the most valuable asset a developer can contribute.

Other than infrastructure failure, developers are often able to code in places where infrastructure simply isn’t available. Internet access is growing more and more ubiquitous, but there are still places to code that don’t have access. Whether it’s on an airplane, train, in a car or at a cafe without wifi, there are times when project infrastructure simply isn’t available and as previously mentioned, time is the most valuable asset of an open source project.

The alternative: Distributed source code management

My distributed source code management system of choice is Git, but that doesn’t mean it’s right for you. The popular choices these days are Git, Mercurial and Bazaar. There are others, with tradeoffs of their own.

While distributed source code management systems don’t solve how to create central project repositories, they make repository creation trivial. This is a big deal. It means that you can start an experimental project with full source code management without polluting the namespace of a central repository. Instead of using the stupid One Big Repository anti-pattern, repositories are cheap things that can be created and destroyed on demand. Some work must be done to make central repository hosting easier, which has given rise to services like GitHub, Bitbucket (Mercurial) and Launchpad (Bazaar). These are great ways to trivially host open source projects. Since they’re offered as free services to open source projects, the need to maintain any repository-oriented infrastructure simply melts away.

The way distributed source code management systems deal with commit access is ingenious. Since anyone can create history, but a project lead still owns their repository, the project lead can pick and choose history elements rather than digging through patchsets. Instead of sending a patch over email, someone can maintain a fully revisioned repository and send individual commits. This reduces the load for both contributor and project lead, while still supporting the old commit access structure.
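To make the "pick and choose history elements" idea a bit more concrete, here is a toy model in Python. It is only a sketch under loud assumptions: the `Commit` and `Repository` classes and the `pick` method are hypothetical names invented for illustration, and none of this reflects how Git, Mercurial or Bazaar actually store or transfer history. It only shows the shape of the workflow: a contributor commits freely in their own clone, and the lead pulls in just the commits they want.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """A toy commit: who made it, why, and a stand-in for the change."""
    author: str
    message: str
    patch: str

    @property
    def id(self) -> str:
        # Content-derived identifier, so the same commit has the same id
        # in every repository that holds it.
        data = f"{self.author}\n{self.message}\n{self.patch}".encode()
        return hashlib.sha1(data).hexdigest()[:8]


@dataclass
class Repository:
    """A toy repository: an owner and an ordered list of commits."""
    owner: str
    history: list = field(default_factory=list)

    def commit(self, message: str, patch: str) -> Commit:
        # Anyone can create history in a repository they own.
        c = Commit(self.owner, message, patch)
        self.history.append(c)
        return c

    def clone(self, new_owner: str) -> "Repository":
        # A clone carries the full history, not just the latest files.
        return Repository(new_owner, list(self.history))

    def pick(self, other: "Repository", commit_id: str) -> Commit:
        # The owner chooses a single commit from someone else's repository.
        for c in other.history:
            if c.id == commit_id:
                if c not in self.history:
                    self.history.append(c)
                return c
        raise KeyError(commit_id)


lead = Repository("lead")
lead.commit("initial import", "+ README")

fork = lead.clone("contributor")           # no commit access needed
fix = fork.commit("fix typo", "- teh\n+ the")
fork.commit("risky rewrite", "(large change)")

lead.pick(fork, fix.id)                    # take the fix, skip the rewrite
print([c.message for c in lead.history])   # ['initial import', 'fix typo']
```

The contributor never needed write access to the lead's repository, and the lead never had to dig a change out of an emailed patch: the picked commit arrives with its author and message intact.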

Distributed version control systems give people the ability to maintain a full project history along with patchsets out of tree as the default mode of operation. The issue of touching history simply goes away.

Since these distributed systems give full repository access locally, the dependence on infrastructure falls away. People can continue to work during infrastructure failures or in places without access to infrastructure, and sync their changes back when it finally becomes available again.
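The offline workflow can be sketched the same way. Again, this is a toy model resting on assumptions: `Remote`, `LocalRepo` and the `reachable` flag are hypothetical names made up for this illustration, not any real tool's API. The point is only that committing touches local state alone, so it keeps working while the shared server is down, and the queued commits go out once it comes back.

```python
class Remote:
    """Toy stand-in for a shared project server that may be unreachable."""

    def __init__(self):
        self.commits = []
        self.reachable = True

    def receive(self, commits):
        if not self.reachable:
            raise ConnectionError("remote unreachable")
        self.commits.extend(commits)


class LocalRepo:
    """Toy local repository: full history lives here, syncing is optional."""

    def __init__(self, remote):
        self.remote = remote
        self.commits = []    # complete local history
        self.unpushed = []   # commits the remote has not seen yet

    def commit(self, message):
        # Committing never touches the network.
        self.commits.append(message)
        self.unpushed.append(message)

    def push(self):
        # Try to sync; on failure the local history is left untouched.
        try:
            self.remote.receive(self.unpushed)
        except ConnectionError:
            return False
        self.unpushed = []
        return True


remote = Remote()
repo = LocalRepo(remote)

remote.reachable = False          # on a plane, or the server is down
repo.commit("refactor parser")
repo.commit("add tests")
print(repo.push())                # False: nothing lost, work continues

remote.reachable = True           # back online
print(repo.push())                # True
print(remote.commits)             # ['refactor parser', 'add tests']
```

In a centralized model the two `commit` calls themselves would have failed, which is exactly the lost developer time described above.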

There are other advantages of these systems over Subversion, but these are the ones related to the core assumption of centrally hosted revision control versus locally hosted revision systems.

The business end of things

So far the assumption has focused on open source projects, but almost all these points apply in some fashion to the business case as well. The cases are more varied and not necessarily as clear, but they are all there. I’ll look at these issues in a future post.

Written by Andreas

October 26, 2008 at 15:21

Posted in version control

34 Responses

  1. ClearCase is much better than CVS, Subversion, and SourceSafe. Too bad it is not free.. but it is the best.

    Phil

    October 26, 2008 at 20:06

  2. So far the assumption has focused on open source projects, but almost all these points apply in some fashion to the business case as well.

    No, they don’t, and you don’t know what you’re talking about when it comes to business requirements.

    ak

    October 26, 2008 at 20:30

  3. Yawn…yet another bashing of Subversion. How about get over yourself? If you don’t like svn don’t use it. There that was much shorter and to the point. Use the VCS that fits your needs.

    Robert

    October 26, 2008 at 20:32

  4. @Phil: Some people at work use ClearCase for their project and are happy with it. Since I haven’t used it, it’s difficult for me to comment.

    @ak: I haven’t talked about the business requirements yet, I’ll be looking forward to proving you right in a few days time.

    @Robert: I’m forced to integrate with svn at work, but I only really use git these days. It is the VCS that fits my needs. This blog post (and in particular the next one I plan to make) is meant to clarify my thoughts on why I think Subversion is past it, so I can explain why and (more interestingly) when git is better. This is something I need to do for myself; I share it in public in case someone else finds value in it and so that I can get feedback (thanks!).

    Andreas

    October 26, 2008 at 23:23

  5. lol @ the svn apologists. there is a reason so many folks bash it.. that reason rhymes with “MIT DUCKS”

    only complete morons and php devs still care about it

    nonsvnapologist

    October 26, 2008 at 23:32

  6. So, the standard question: what do you know that Google doesn’t? (code.google.com is all svn).

    cypher

    October 26, 2008 at 23:52

  7. @cypher While code.google.com uses SVN exclusively (for the time being anyway), some Google projects use Git. Android is the most recent example.

    Andreas

    October 27, 2008 at 00:03

  8. Subversion works well and is used by a LOT of individuals and companies. Get over it.

    rich

    October 27, 2008 at 00:29

  9. @rich: Subversion doesn’t work as well as it could. I’m fine with it being used a lot. Like I said earlier in the comments, I’m interested in figuring out when it shouldn’t be used by people around me.

    Andreas

    October 27, 2008 at 00:33

  10. You missed the DVCS evangelism bus by about a year. You’re such a non-conformist!

    Brian

    October 27, 2008 at 04:28

  11. LOL…gotta love tech retards who argue ‘dont like it, dont use it’ & ‘get over it’ in opinion writeups about tools and their best uses.

    I found your post informative, keep it up.

    Satir

    October 27, 2008 at 04:31

  12. I know several people who use ClearCase for projects at work, but not a single soul who actually _likes_ it.

    cloud case

    October 27, 2008 at 05:24

  13. To nonsvnapologist.

    You forgot the first rule of intellectual honesty: Do not overstate the power of your argument.

    “If someone portrays their opponents as being either stupid or dishonest for disagreeing, intellectual dishonesty is probably in play. Intellectual honesty is most often associated with humility, not arrogance.”

    salamander

    October 27, 2008 at 06:13

  14. …or we could just summarize the argument the way Linus Torvalds did when he presented Git at a Google Talk… if you don’t agree with me, then you’re stupid.

    ankushnarula

    October 27, 2008 at 06:13

  15. I’ve just started using git and I actually like it. That said, Andreas, you’re not giving a fair comparison here.

    Apache uses Subversion exactly as it was designed and specifically to limit the number of shell accounts necessary. Many of the Subversion developers _are_ Apache developers and so they would know a thing or two about repository design, about how revision control is used in Apache, and what we need there for the types of communities at Apache.

    In fact, one of the issues with git is that it cannot handle large repository sizes, requiring you to split each project into a small repository which leads to its own set of headaches.

    Moreover, the subversion team is still adding features and I wouldn’t be surprised if we see local commits within the next version or two. So don’t write it off yet.

    Finally, ClearCase is a nightmare. Stay away from it. Have anyone who suggests it committed to an institution.

    Aaron

    October 27, 2008 at 08:06

  16. @ankushnarula I don’t have Mr. Torvalds’s track record quite yet, so I’ll stick to at least the semblance of humility for a while ;)

    @Aaron It was a bit unfair to single out the ASF for their structure. The Scala repository isn’t much better. Why do people check code straight into some projects, without the trunk/tags/branches structure at all, or into weird directories straight under the project? It might seem unfair to blame these things on the big repository. It looks to me like laziness in the Scala project’s case. It just wasn’t worth creating the correct directory structure for some things. In the ASF’s case, it looks like it might be due to the strict access control. If you only have one repository, that’s what you’re going to be using for revision control.

    I guess I don’t understand why it would be better to have everything in a big repository when the projects are clearly demarcated anyway. Perhaps you could explain this to me?

    That said, you’re right, I was a bit unfair in that section of the comparison. Someone on reddit pointed out how the commit access control is done, and it seems ok.

    As for Subversion’s improving feature set… They have a lot to catch up on. Don’t get me wrong, I’d love to see Subversion with all its tools and mindshare get up to speed with the new VCSes. That could take a while, if the 1.5 release cycle was anything to go by, but I look forward to seeing what they come up with.

    Andreas

    October 27, 2008 at 09:57

  17. The advantage of having many projects in a single repository is that code can move around between projects without losing history. Think of the ASF’s “incubator” model, where projects eventually get promoted and move around. Another advantage to single huge repository is ease of administration: only one set of hook scripts to manage, one access file, one thing to do incremental backups of, etc.

    Cheers!

    — an svn founder, who loves hg

    Ben Collins-Sussman

    October 27, 2008 at 13:42

  18. Aha. Personally, I think the ease of administration is a bit of a red herring; it eases the administrator side of things, but not the user’s ability to administrate. But if the user doesn’t want/need to administrate that doesn’t really matter. Thanks for clearing that up.

    Incidentally, there is open source software (gitorious) that handles most aspects of this for git repository hosting sites. I haven’t used gitorious though (I’m a hopeless github addict), so I can’t vet it.

    Andreas

    October 27, 2008 at 14:29

  19. SVN serves its purpose. Let’s not all jump on the DVCS bandwagon just because the cool kids like it. Plus, you can wrap Git around SVN, and your problem is solved.

    Seth Cardoza

    October 27, 2008 at 16:50

  20. Read up, I already use Git’s SVN wrapper. It’s not satisfactory. I don’t use Git because “cool kids like it”. I use Git because I get a better user experience from it. I heartily recommend anyone who thinks SVN is “current” or “good enough” to try a DVCS for a serious amount of time. The key upshot for me was the opening of new workflows that I simply couldn’t achieve naturally with SVN.

    Andreas

    October 27, 2008 at 17:05

  21. […] Subversion sucks, get over it « Andreas Jacobsen’s Distraction (tags: svn) […]

  22. […] a comment » Andy Singleton posted a reply to (among others) my rant on Subversion sucking. The thought experiment is pretty interesting. I’m not convinced […]

  23. “So far the assumption has focused on open source projects, but almost all these points apply in some fashion to the business case as well. The cases are more varied and not necessarily as clear, but they are all there. I’ll look at these issues in a future post.”

    I know this is over 6 months old, but I was curious if you’d ever come up with the post to which you refer above?

    Mike

    August 18, 2009 at 20:57

  24. The answer to that is yes and no. I’ve written up a rough draft for the post, I’m not satisfied with it and I’m not sure I’d want to post it either way.

    That said, I’ve been thinking about this again recently and the discussion really just boils down to an opportunity cost issue for me. Anyone who’s using Subversion is avoiding the cost of change, but also paying the cost of lost ways to increase efficiency on their use of version control. Any argument from the business standpoint will have to be based on that.

    I’m also annoyed with the ‘average devs’ strawman, but in the end that just boils down to the cost of change thing.

    Andreas

    August 19, 2009 at 00:11

  25. A recent conversation had about SVN:

    web1:/usr/local/zend/apache2/htdocs/workspace/ben # svn cleanup
    svn: ‘application/layouts’ is not a working copy directory
    web1:/usr/local/zend/apache2/htdocs/workspace/ben # svn cleanup
    svn: ‘application/layouts’ is not a working copy directory

    (1:38:22 PM) Aj: oh I’m going to stab svn
    (1:38:40 PM) Aj: grrr

    (1:44:07 PM) Aj: okay everything is fixed

    … The sad truth of version control systems is that these precious minutes in this case 6 minutes are lost over and over again … instead of being unobtrusive version control is intrusive modifying the way you work, the way you store and organize your projects.

    The server model:
    Subversion and all revision control miss the mark because the server-model is important when it comes to revision control, and code access. Moving a project into a working space, and porting the project, editing it, and differencing and controlling file revisions over a timeline (yes we do work in time) is extremely important.

    Things like iNotify exist, diff exists, and many other linux server tools exist.

    Subversion doesn’t integrate with any server-architecture and this is why it fails and sucks.

    It’s crappy and complicated and confusing to teach new developers to use, and it adds an unnecessary overhead to a project.

    Subversion and all revision control systems fail to solve the CDP or “Real time versioning” problem.

    only iNotify, or consistently and constantly examining file system changes is capable of this.

    Version Control needs to address conflicts at-the-server and employ a workspace model. Conflicts shouldn’t be dealt with at the time of replication – what subversion calls “committing”. The client shouldn’t issue a commit the server should issue the commit after the source has already been exchanged. Version control should manage versions for multiple instances of the same project and consolidate them on a schedule, or driven by a user.

    In general version control fails completely for non-technical users… Version control file-systems are a good replacement, but this doesn’t truly work for coding environments where the idea of a “project” and “application” exists.

    Ben

    September 28, 2009 at 22:28

  26. […] really fun to read page about someone that really hates Subversion too. I think I mentioned some other bad articles about subversion before. I think Subversion is ok if you just want to make a simple checkout from […]

  27. If it were only six minutes. I just made a wrong svn switch (forgot the last component in the path; trunk/c to branches/wawa instead of branches/wawa/c), Ctrl-C’ed it after I saw Makefile disappear. After that svn switch or update call for cleanup, and cleanup does not work. Thus I had to re-checkout everything — over GPRS. Real fun ensued. (The big subtree was already gone in the old sandbox, so rescuing it wouldn’t be any faster.)

    Now if it were as easy to kill repositories…

    Andreas Krey

    November 17, 2009 at 12:46

  28. Let’s see SVN do this:

    Create a complete base tree of source. Create a sub project and Call it DEV.
    Create another sub project off the root called QA. Now, SHARE all of the files from DEV into QA. Call a batch file that PINS everything.
    Create a sub project off of the root called RELEASE. Share all the files from DEV in QA and call a batch file that PINS everything.

    If I fix bugs, I do it in dev. I “promote” to QA by simply un-pinning the file in QA and re-pinning at the new version. My builds all run off of QA source.

    When QA gives the green light, everything in RELEASE is unpinned and re-pinned at the QA version.

    This is idiot-proof. I know, I have worked with so many idiots it’s hard to count. You SVN guys, I hope to hell you don’t make a mistake merging all of that spaghetti. Good Luck.

    And, as if having to run a utility once a week to compact your database is such a big deal. Wusses.

    Brian M.

    Brian M

    July 2, 2010 at 08:05

  29. @Ben, comment #128: to solve what you call the “problem of real-time versioning” one would use a
    versioning filesystem, not a version control system, IMHO. I for one like to determine what goes together in a
    changeset or revision and what the commit message reads, and I don’t like my every file save to be versioned,
    as it would be with inotify (at least not in the or a project’s repository).

    mabri

    July 17, 2011 at 13:41

  30. ur stupid

    anonymous

    March 7, 2012 at 19:55

  31. 99% of code written is mundane uninspired hacks anyway, so really I think it’s programmers who have to get over themselves. It doesn’t matter where you store your code, it sucks, get over it. I think it’s a hoot that wars develop over a simple code repository, do you really think you matter all that much?

    arealprogrammer

    March 16, 2012 at 13:06

  32. I hear a lot of “SVN doesn’t suck. Proof ? lots of people use it !”
    OMG! Jeez you’re right ! When was the last time some tool was still widely used in the industry and yet there were much better alternatives available ? I’m pretty sure it never happened !
    Oh wait… It happened last second, and every second before since industry exists.

    The industry doesn’t use the most powerful or usable tool available at any given time.
    It uses the tool that doesn’t disrupt the existing workflows because hey : we’ve always been doing it this way why change ?
    Managers don’t take risks. Especially on changing from a tool they didn’t understand but magically “works” from their perspective, to another newer tool that they understand even less.
    In any big corporation you don’t choose your VCS you use the one your boss tells you to use.

    If you don’t think SVN sucks then come back with technical arguments demonstrating that what the OP said was wrong/incomplete, but don’t give us bs about how big companies still use SVN.
    They also still use Windows XP, but it sucks, it sucks SO SO BAD. It was Ok when it came out and now it is an old decrepit piece of garbage.

    felix

    April 12, 2013 at 10:00

  33. I was a user and administrator of Clearcase for about 10 years, and I’ve been the SVN admin/helpdesk for about 100 developers for the past 2 years. About the only advantage I can find with SVN is that it’s free, and somewhat simpler to administer. But it was truly a huge step backwards from Clearcase in functionality and user experience. I’m always surprised when I read “git vs svn” debates and they only seem to focus on centralized vs distributed. IMHO, the worst part of SVN is how they decided to handle directories, filenames and renames. Clearcase was simply brilliant at handling file renames, graphical version trees showing all branches, merging, rebasing, etc. My 100 developers work very much in parallel on common files, supporting multiple releases, and I can tell you merging and propagating in SVN is an absolute nightmare, largely due to fundamental design flaws in filename handling. Mistakes are made all the time, much more so than were made in Clearcase, and these are seasoned software engineers. I’m truly hoping git improves on the renaming and merging issues, but I never see that discussed in the comparisons.

    Jay

    April 17, 2013 at 19:13

  34. its 2014. sadly svn is still around, and sadly im still forced to use it for work. when will these dinosaurs go extinct?

    Taderp

    February 27, 2014 at 17:35


Comments are closed.