Internal Subversion Externals

2006-11-23

Subversion externals provide a simple way for a project to pull together components from more than one repository. This post shows how they can also be used to create modules which collect together components from the same repository.

An svn:externals example

This blog is built using Typo, which is itself built on top of the Ruby on Rails application framework. If we peer into the Typo Subversion repository, we can see how a tagged version of the Ruby on Rails code gets pulled in.

$ svn proplist --verbose svn://typosphere.org/typo/trunk/vendor
Properties on 'svn://typosphere.org/typo/trunk/vendor':
  svn:externals : rails http://dev.rubyonrails.com/svn/rails/tags/rel_1-1-6

The proplist command above lists the properties which have been set on a Typo repository URL, and in this case shows that the typo/trunk/vendor directory has an svn:externals property linking the Subversion URL http://dev.rubyonrails.com/svn/rails/tags/rel_1-1-6 to the local name rails. (Don’t be confused by the http:// protocol in the rubyonrails URL — it’s still a Subversion repository we’re linking to, it’s just one that’s served by Apache.)

The rails directory is not part of the Typo repository, as the following listing shows:

$ svn list svn://typosphere.org/typo/trunk/vendor
akismet/
bluecloth/
....
syntax/
uuidtools/

When we check out Typo, though, Subversion also fetches the tagged version of Ruby on Rails at URL http://dev.rubyonrails.com/svn/rails/tags/rel_1-1-6 and places it in a local directory called rails. Here’s what we see when we check the code out.

$ svn checkout svn://typosphere.org/typo/trunk/vendor

Fetching external item into 'vendor/rails'
A    vendor/rails/cleanlogs.sh
A    vendor/rails/release.rb
....

Some things to notice

Note here that we’re pulling in a tagged version of Ruby on Rails — not the main development trunk. The Typo developers sensibly choose to develop against a stable version of the Ruby on Rails framework. They could even have pulled in a particular Rails repository revision by including the revision number in the svn:externals definition (see svn help propset for details).
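
As a rough sketch of what a pinned definition looks like, the helper below builds one svn:externals line with an optional revision. The function name externals_line is made up for this post, and any revision number passed to it (5000 in the usage note) is purely illustrative, not a real Rails revision.

```shell
#!/bin/sh
# Build one svn:externals definition line: "local-name [-rREV] URL".
# externals_line is a hypothetical helper, not part of Subversion.
# Usage: externals_line LOCAL_NAME URL [REVISION]
externals_line() {
    name=$1
    url=$2
    rev=${3:-}
    if [ -n "$rev" ]; then
        # Pinned: the external stays at this revision of the other repository.
        printf '%s -r%s %s\n' "$name" "$rev" "$url"
    else
        # Unpinned: the external follows the latest revision of the URL.
        printf '%s %s\n' "$name" "$url"
    fi
}
```

For instance, externals_line rails http://dev.rubyonrails.com/svn/rails/tags/rel_1-1-6 5000 prints rails -r5000 http://dev.rubyonrails.com/svn/rails/tags/rel_1-1-6, which could then be fed to svn propset.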

Note also that the working copy we get in the rails subdirectory retains its association with the host repository at http://dev.rubyonrails.com/svn/rails: if authorised to do so, we could modify this working copy and check changes back in.

A Project Hierarchy

Now consider a repository which is arranged into projects named blue_goat, red_bear, yellow_dog, … . Each project has a top-level directory beneath which are sub-directories for source code, tests, build files and documentation. If we check everything out, we end up with a working copy which looks something like this.

full working copy layout
projects
|-- blue_goat/
|   |-- build/
|   |   `-- build.xml
|   |-- doc/
|   |   `-- user_guide.pdf
|   |-- src/
|   |   `-- BlueGoat.java
|   `-- test/
|       `-- TestBlueGoat.java
|-- red_bear/
|   |-- build/
|   |   `-- Makefile
|   |-- doc/
|   |   |-- note.txt
|   |   |-- spec.html
|   |   `-- user_guide.pdf
|   |-- src/
|   |   |-- main.cpp
|   |   |-- red_bear.cpp
|   |   `-- red_bear.hpp
|   `-- test/
|       `-- regression_test.sh
`-- yellow_dog/
    |-- build/
    |-- doc/
    |   `-- user_guide.rst
    |-- src/
    |   `-- yellow_dog.py
    `-- test/
        `-- test_yellow_dog.py

To save on screen space, I’ve shown only three projects and a tiny subset of the files in these projects. In reality, there are tens of thousands of files, and, since some of the test files are rather large, they occupy several gigabytes on disk.

For the developers, this is fine. Typically developers are assigned to one project at a time, and they check out a working copy for that project only. For the technical author, it’s a different story.

The Technical Author

A single technical author is responsible for the documentation for all active projects. Like everyone else on the team, the author uses version control; in contrast to everyone else on the team, the author is interested in just a single sub-directory of every project: the doc directory.

Here’s what we can do. First, create and check out a collected_docs directory.

$ svn mkdir svn://svnserver/collected_docs -m "Collected documentation."
$ svn co svn://svnserver/collected_docs

Now set up the desired links to project subdirectories. We’ll put them in a temporary file for now.

$ cat > /tmp/externals.props
blue_goat svn://svnserver/projects/blue_goat/doc
red_bear svn://svnserver/projects/red_bear/doc
yellow_dog svn://svnserver/projects/yellow_dog/doc
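
With more than a handful of projects, typing these lines by hand gets tedious. Here is a minimal sketch of a generator, assuming every project keeps its documentation in a doc directory directly beneath the project, as in the layout above. The function name doc_externals is invented for this example.

```shell
#!/bin/sh
# Print one "name URL" externals line per project, each pointing at that
# project's doc directory under BASE_URL.
# doc_externals is a hypothetical helper, not part of Subversion.
# Usage: doc_externals BASE_URL PROJECT...
doc_externals() {
    base=${1%/}   # drop one trailing slash, if present
    shift
    for project in "$@"; do
        printf '%s %s/%s/doc\n' "$project" "$base" "$project"
    done
}
```

Running doc_externals svn://svnserver/projects blue_goat red_bear yellow_dog > /tmp/externals.props should produce the same three lines as the file typed in above.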

Next, use this file to set the svn:externals property on the new collected_docs directory, and check this change in.

$ svn propset svn:externals -F /tmp/externals.props collected_docs
property 'svn:externals' set on 'collected_docs'

$ svn commit collected_docs -m "Added links to project documentation."
Sending        collected_docs

Committed revision 4567.

When we update collected_docs we get the documentation directories.

$ svn update collected_docs

Fetching external item into 'collected_docs/blue_goat'
A    collected_docs/blue_goat/user_guide.pdf
Updated external to revision 4567.

Fetching external item into 'collected_docs/red_bear'
A    collected_docs/red_bear/note.txt
A    collected_docs/red_bear/user_guide.pdf
A    collected_docs/red_bear/spec.html
Updated external to revision 4567.

Fetching external item into 'collected_docs/yellow_dog'
A    collected_docs/yellow_dog/user_guide.rst
Updated external to revision 4567.

Updated to revision 4567.

As a result, the technical author’s working copy contains just what’s needed.

collected_docs
|-- blue_goat/
|   `-- user_guide.pdf
|-- red_bear/
|   |-- note.txt
|   |-- spec.html
|   `-- user_guide.pdf
`-- yellow_dog/
    `-- user_guide.rst

Have we forked the documentation by doing this? No. The externals definitions act like soft links, so any changes committed from the collected_docs working copy appear in the project directory as they should, and vice versa.

Limitations

As you’ve probably noticed, even though we used an internal external, we still had to supply a fully qualified repository URL. Attempts to use a relative path fail: we can set the property, but checking out the external then fails with an "Unrecognized URL scheme" error. So if we want to use this technique to tag and branch subsets of a repository, we’ll need to write a wrapper script.
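
One possible shape for the core of such a wrapper script is sketched below. It assumes we keep our own file of "name relative/path" lines (a made-up format, not something Subversion reads) and expand it against a chosen repository root just before calling svn propset. The function name qualify_externals is likewise hypothetical.

```shell
#!/bin/sh
# Read "name relative/path" lines on stdin and print fully qualified
# "name ROOT_URL/relative/path" externals lines on stdout.
# qualify_externals and its input format are assumptions for this sketch.
# Usage: qualify_externals ROOT_URL < relative.props
qualify_externals() {
    root=${1%/}   # drop one trailing slash, if present
    while read -r name path; do
        # Skip blank lines and "#" comment lines in our input file.
        case $name in
            ''|'#'*) continue ;;
        esac
        # Strip one leading slash so the joined URL has a single separator.
        printf '%s %s/%s\n' "$name" "$root" "${path#/}"
    done
}
```

Pointing the same relative file at a different root, or at a root with a branch or tag prefix appended, would then yield the definitions for that branch or tag without editing each URL by hand.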

A second limitation is that if someone moves one of the externals endpoints, our collected_docs working copy again fails to update.

$ svn move svn://svnserver/projects/yellow_dog/doc \
      svn://svnserver/projects/yellow_dog/documentation \
      --message "No abbreviations, please"

$ svn update collected_docs
...
Fetching external item into 'collected_docs/yellow_dog'
svn: Target path does not exist
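
A dangling endpoint like this can at least be spotted before the next update. The sketch below walks a definitions file such as /tmp/externals.props and reports every entry whose URL can no longer be listed. It assumes simple two-field "name URL" lines (no -r revisions), and both function names are invented for this example. The probe is passed in as a parameter so the logic can be tried out without a live server.

```shell
#!/bin/sh
# Default probe: succeeds if the URL can be listed. Needs the svn client
# and access to the repository.
svn_url_exists() {
    svn list "$1" >/dev/null 2>&1
}

# Print "name URL" for every definition whose URL fails the probe.
# Returns non-zero if at least one definition is broken.
# Assumes two-field "name URL" lines; both helpers are hypothetical.
# Usage: broken_externals FILE [PROBE_COMMAND]
broken_externals() {
    file=$1
    probe=${2:-svn_url_exists}
    status=0
    while read -r name url; do
        # Skip blank lines and "#" comment lines.
        case $name in
            ''|'#'*) continue ;;
        esac
        # </dev/null stops the probe from consuming the rest of the file.
        if ! "$probe" "$url" </dev/null; then
            printf '%s %s\n' "$name" "$url"
            status=1
        fi
    done <"$file"
    return $status
}
```

After the move shown above, broken_externals /tmp/externals.props would be expected to report the yellow_dog line, telling us which definition needs to be repointed at the new documentation directory.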