
2005-02-24 20:08:13

Chapter 7: Branches

This is part of an online book called Source Control HOWTO, a best practices guide on source control, version control, and configuration management.

< Chapter 6 Chapter 8 >

What is a branch?

A branch is what happens when your development team needs to work on two distinct copies of a project at the same time.  This is best explained by citing a common example:

Suppose your development team has just finished and released version 1.0 of UltraHello, your new flagship product, developed with the hope of capturing a share of the rapidly growing market for "Hello World" applications. 

But now that 1.0 is out the door, you have a new problem you have never faced before.  For the last two years, everybody on your team has been 100% focused on this release.  Everybody has been working in the same tree of source code.  You have had only one "line of development", but now you have two: maintenance of the 1.0 release and development of version 2.0.

It is important for these two lines of development to remain distinct.  If you release a version 1.0.1, you don't want it to contain a half-completed implementation of a 2.0 feature.  So what you need here is two distinct source trees so your team can work on both lines of development without interfering with each other.

The most obvious way to solve this problem would simply be to make a copy of your entire source control repository.  Then you can use one repository for 1.0 maintenance and the other repository for 2.0 development.  I know people who do it this way, but it's definitely not a perfect solution.

The two-repository approach becomes disappointing in situations where you want to apply a change to both trees.  For example, every time we fix a bug in the 1.0 maintenance tree, we probably also want to apply that same bug fix to the 2.0 development tree.  Do we really want to have to do this manually?  If the bug fix is a simple change, like fixing the incorrect spelling of the word "Hello", then it won't take a programmer very long to make the change twice.  But some bug fixes are more involved, requiring changes to multiple files.  It would be nice if our source control tool would help.  A primary goal for any source control tool should be to help software teams be more concurrent, everybody busy, all at the same time, without getting in each other's way.

To address this very type of problem, source control tools support a feature which is usually called "branching".  This terminology arises from the tendency of computer scientists to use the language of a physical tree every time hierarchy is involved.  In this particular situation, the metaphor breaks down very quickly, but we keep the name anyhow. 

A somewhat better metaphor happens when we envision a nature path which forks into two directions.  Before the fork, there was one path.  Now there are two, but they share a common history.  When you use the branching feature of your source control tool, it creates a fork in the path of your development progress.  You now have two trees, but the source control tool has not forgotten the fact that these two trees used to be one.  For this reason, the SCM tool can help make it easier to take code changes from one fork and apply those changes to the other.  We call this operation "merging branches", a term which highlights why the physical tree metaphor fails.  The two forks of a nature path can merge back into one, but two branches of an oak tree just don't do that.  I'll talk a lot more about merging branches in the next chapter.

At this point I should take a step back and admit that my example of doing 1.0 maintenance and 2.0 features is very simplistic.  Real life examples are sometimes far more complicated, involving multiple branches, active development in each branch, and the need to easily migrate changes between any two of them.  Branching and merging is perhaps the most complex operation offered by a source control tool, and there is much to say about it.  I'll begin with some "cars and clocks" stuff and talk about how branching works "under the hood".

Two branching models

Best Practice: Organize your branches

The "folder" model of branching usually requires you to have one extra level of hierarchy in your repository tree. Keep your main development in a folder named $/trunk. Then create another folder called $/branches. Each time you create a branch off of the trunk, put it in $/branches.

First of all, let's acknowledge that there are [at least] two popular models for branching.  In the first approach, a branch is like a parallel universe. 

In order to retrieve a file, you specify not just a path but the name of the universe, er, branch, from which you want the file retrieved.  If you don't specify a branch, then the file will be retrieved from the "default branch".  This is the approach used by CVS and PVCS.

In the other branching model, a branch is just another folder, located in the same repository hierarchy as everything else.  When you create a branch of a folder, it shows up as another folder.  With this approach, a repository path is sufficient to describe a location.
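To make the difference concrete, here is a minimal sketch of how a file might be addressed under each model.  Everything in it is an assumption made up for illustration: the dictionaries, the branch names "main" and "rel-1.0", the file path, and the two lookup functions.  It does not reflect how CVS, PVCS or any other real tool stores things.

```python
# Toy illustration only: two in-memory "repositories", one per addressing model.

# "Parallel universe" model: a file is identified by (branch name, path).
universe_repo = {
    ("main", "src/hello.c"): "trunk copy of hello.c",
    ("rel-1.0", "src/hello.c"): "1.0 copy of hello.c",
}
DEFAULT_BRANCH = "main"  # hypothetical name for the default branch


def get_from_universe(path, branch=None):
    """Fetch a file; if no branch is named, use the default branch."""
    return universe_repo[(branch or DEFAULT_BRANCH, path)]


# "Folder" model: the branch is just part of the repository path.
folder_repo = {
    "$/trunk/src/hello.c": "trunk copy of hello.c",
    "$/branches/1.0/src/hello.c": "1.0 copy of hello.c",
}


def get_from_folder(path):
    """Fetch a file; the path alone is enough to describe the location."""
    return folder_repo[path]
```

In the first model the same path can mean two different files depending on which universe you ask.  In the second, every file has exactly one address.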

Personally, I prefer the "folder" style of branching over the "parallel universe" style of branching, so my writing will generally come from this perspective.  This is the approach used by most modern source control tools, including Vault, Subversion (they call it "copy"), Perforce (they call it "Inter-File Branching") and Visual Studio Team System (looks like they call it branching in "path space").

Under the hood

Good source control tools are clever about how they manage the underlying storage issues of branching.  For example, let us suppose that the source code tree for UltraHello is stored in $/projects/Hello/trunk.  This folder contains everything necessary to do a complete build of the shipping product, so there are quite a few subfolders and several hundred files in there.

Now that you need to go forward with 1.0 maintenance and 2.0 development simultaneously, it is time to create a branch.  So you create a folder called $/projects/Hello/branches.  Inside there, you create a branch called 1.0.

At the moment right after the branch, the following two folders are exactly the same:

$/projects/Hello/trunk

$/projects/Hello/branches/1.0

It appears that the source control tool has made an exact copy of everything in your source tree, but actually it hasn't.  The repository database on disk has barely increased in size.  Instead of duplicating the contents of every file, it has merely pointed the branch at the same contents as the trunk.

As you make changes in one or both of these folders, they diverge, but they continue to share a common history.
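If you want to see why the repository barely grows, here is a rough sketch of the idea.  It is a toy model, not the storage format of Vault or any other product: the Repo class, its method names, and the choice of a content hash as the "pointer" are all my own assumptions.  It also leaves out history entirely, so it only shows the storage trick.

```python
import hashlib


class Repo:
    """Toy repository: file contents are stored once, folders hold pointers."""

    def __init__(self):
        self.blobs = {}    # content hash -> file contents
        self.folders = {}  # folder path -> {file name: content hash}

    def _store(self, data):
        key = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(key, data)  # identical contents are stored once
        return key

    def write(self, folder, name, data):
        self.folders.setdefault(folder, {})[name] = self._store(data)

    def read(self, folder, name):
        return self.blobs[self.folders[folder][name]]

    def branch(self, src, dst):
        # Copy only the table of pointers; no file contents are duplicated.
        self.folders[dst] = dict(self.folders[src])


repo = Repo()
trunk = "$/projects/Hello/trunk"
branch = "$/projects/Hello/branches/1.0"

repo.write(trunk, "hello.c", b"Hello World v1")
repo.write(trunk, "README", b"UltraHello")

blobs_before = len(repo.blobs)
repo.branch(trunk, branch)
assert len(repo.blobs) == blobs_before  # branching stored no new contents

# A change in the trunk adds one new blob; the branch still sees the old one.
repo.write(trunk, "hello.c", b"Hello World v2")
```

After the last line the two folders have diverged on hello.c, but README is still one stored blob that both folders point to.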

The Pitiful Lives of Nelly and Eddie

In order to use your source control tool most effectively, you need to develop just the right amount of fear of branching.  This delicate balance seems to be very difficult to find.  Most people either have too much fear or not enough.

Nelly is an example of a person who has too much fear of branching.  Nelly has a friend who has a cousin with a neighbor who knows somebody whose life completely fell apart after they tried using the branch and merge features of their source control tool.  So Nelly refuses to use branching at all.  In fact, she wrote a 45-page policy document which requires her development team to never use branching, because after all, "it's not safe". 

So Nelly's development team goes to great lengths to avoid using branching, but eventually they reach a point where they need to do concurrent development.  When this happens, they do anything they can to solve the problem, as long as it doesn't involve the word "branch".  They fork a copy of their tree and begin working with two completely separate repositories.  When they need to make a change to both repositories, they simply make the change by hand, twice.

Best Practice: Don't be afraid of branches

If you're doing parallel development, let your source control tool help. That's what it was designed to do.

Obviously these people are still branching, but they keep Nelly happy by never using "the b word".  These folks are happy, and we should probably just leave them alone, but the whole situation is kind of sad.  Their source control tool has features which were specifically designed to make their lives easier.

At the other end of the spectrum is Eddie, who uses branching far too often.  Eddie started out just like Nelly, afraid of branching because he didn't understand it.  But to his credit, Eddie overcame his fear and learned how powerful branching and merging can be.

And then he went off the deep end.

After he tried branching and had a good first experience with it, Eddie now uses it all the time.  He sometimes branches multiple times per week.  Every time he makes a code change, he creates a private branch. 

Eddie arrives on Monday morning and discovers that he has been assigned bug 7136 (In the Elbonian version, the main window is too narrow because the Elbonian language requires 9 words to say "Hello World".)  So Eddie sits down at his desk and begins the process of fixing this bug.  The first thing he does is create a branch called "bug_7136".  He makes his code change there in his "private branch" and checks it in.  Then, after verifying that everything is working okay, he uses the Merge Branches feature to migrate all changes from the trunk into his private branch, just to make sure his code change is compatible with the very latest stuff.  Then he runs his test suite again.  Then he notices that the repository has changed yet again, then he does this loop once more.  Finally, he uses Merge Branches to apply his code fixes to the trunk.  Then he grabs a copy of the trunk code, builds it and runs the test suite to verify that he didn't accidentally break anything.  When at last he is satisfied that his code change is proper, he marks bug 7136 as complete.  By now it is Friday afternoon at 4:00pm, and there's no point in starting anything new at this point, so he just decides to go home.

Eddie never checks anything into the main trunk.  He only checks stuff into his private branch, and then merges changes into the trunk.  His care and attention to detail are admirable, but he's spending far more time using his source control tool than working on his code.

Let's not even think about what the kids would be like if Eddie and Nelly were to get married.

Dev--Test--Prod

Once you have established the proper level of comfort with the branching features of your source control tool, the next question is how to use those features effectively.

One popular methodology for SCM is often called "code promotion".  The basic idea here is that your code moves through three stages, "dev" (stuff that is in active development), "test" (stuff that is being tested) and "prod" (stuff that is ready for production release).

For a variety of reasons, I personally don't like working this way, but there's nothing wrong with it.  Lots of people use this code promotion model effectively, especially in larger companies where the roles of programmer and tester are very clearly separated. 

I understand that PVCS has specific feature support for "promotion groups", although I've never used this product personally.  With other source control tools, the code promotion model can be easily implemented using three branches, one for dev, one for test, and one for prod.  The Merge Branches feature is used to promote code from one level to the next.
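Here is a small sketch of the three-branch version of code promotion.  It is heavily simplified and the names are my own: each branch is just a dictionary of file names to contents, and the promote function stands in for the Merge Branches feature by copying the source stage's files over the next stage.  A real merge is much smarter than that.

```python
# Toy model of code promotion with three branches.
STAGES = ["dev", "test", "prod"]


def promote(branches, source):
    """Push everything in `source` to the next stage and return its name."""
    i = STAGES.index(source)
    if i == len(STAGES) - 1:
        raise ValueError("prod is the last stage; nothing to promote to")
    target = STAGES[i + 1]
    # Stand-in for Merge Branches: the source stage's versions win.
    branches[target].update(branches[source])
    return target


branches = {stage: {} for stage in STAGES}

# A change lands in dev first, then moves up one level at a time.
branches["dev"]["hello.c"] = "wider main window"
promote(branches, "dev")   # now in test, not yet in prod
```

The point of the model is that nothing reaches prod without passing through test first.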

Eric's Preferred Branching Practice

Best Practice: Keep a "basically unstable" trunk.

Do your active development in the trunk, the stability of which increases as you approach release. After you ship, create a maintenance branch and always keep it very stable.

Here at SourceGear our main development tree is called the "trunk".  In our repository it is rooted at $/trunk and it contains all the source code and documentation for our entire product.

Most new code is checked into the trunk.  In general, our developers try to never "break the tree".  Anyone who checks in code which causes the trunk builds to fail will be the recipient of heaping helpings of trash talk and teasing until he gets it fixed.  The trunk should always build, and as much as possible, the resulting build should always work.

Nonetheless, the trunk is the place where active development of new features is happening.  The trunk could be described as "basically unstable", a philosophy of branching which is explained in Essential CVS, a fine book on CVS published by O'Reilly.  In our situation, the stability of the trunk build fluctuates over the months during our development cycle.

During the early and middle parts of a development cycle, the trunk is often not very stable at all.  As we approach alpha, beta and final release, things settle down and the trunk gets more and more stable.  Not long before release, the trunk becomes almost sacred.  Every code change gets reviewed carefully to ensure that we don't regress.

At the moment of release, a branch gets created.  This branch becomes our maintenance tree for that release.  Our current maintenance branch is called "3.0", since that's the current major version number of our product.  When we need to do a bug fix or patch release, it is done in the maintenance branch.  Each time we do a release out of the maintenance branch (like 3.0.2), we apply a label.
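The branch-then-label routine can be sketched in a few lines.  This is a toy model under my own assumptions: the repository is a dictionary of folders, a label is modeled as a frozen snapshot of a folder, the branch lives at $/branches/3.0, and the file contents are placeholders.  The function names are hypothetical and do not correspond to commands in any particular tool.

```python
# Toy model: folders map file names to contents; labels are frozen snapshots.
repo = {"$/trunk": {"hello.c": "3.0 code"}}
labels = {}


def create_branch(repo, src, dst):
    """Start a new folder with the same contents as `src`."""
    repo[dst] = dict(repo[src])


def apply_label(repo, labels, folder, name):
    """Remember exactly what `folder` looked like at this moment."""
    labels[name] = dict(repo[folder])


# At the moment of release, the maintenance branch is created.
create_branch(repo, "$/trunk", "$/branches/3.0")

# A bug fix goes into the maintenance branch and ships as a patch release.
repo["$/branches/3.0"]["hello.c"] = "3.0 code + fix A"
apply_label(repo, labels, "$/branches/3.0", "3.0.1")

# Another fix, another patch release, another label.
repo["$/branches/3.0"]["hello.c"] = "3.0 code + fix A + fix B"
apply_label(repo, labels, "$/branches/3.0", "3.0.2")

# Meanwhile the trunk moves on without touching the branch or the labels.
repo["$/trunk"]["hello.c"] = "risky new feature work"
```

Because each label is a snapshot, you can always get back exactly what shipped as 3.0.1, no matter what has happened in the branch or the trunk since.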

After the maintenance branch is created, the trunk once again becomes "basically unstable".  Developers start adding the risky code changes we didn't want to include in the release.  New feature work begins.  The cycle starts over and repeats itself.

When to branch?  Part 1:  Principles

Best Practice: Don't create a branch unless you are willing to take care of it.

A branch is like a puppy.

Your decisions about when to branch should be guided by one basic principle:  When you create a branch, you have to take care of it.  There are responsibilities involved. 

Be afraid of branches, but not so afraid that you never use the feature.  Don't branch on a whim, but do branch when you need to branch.

When to branch?  Part 2:  Scenarios

There are some situations where branching is NOT the recommended way to go:

And there are some situations where branching is the best practice:

When to branch?  Part 3:  Pithy Analogy

Looking Ahead

In the next chapter I will delve into the topic of merging branches.

 

