The problem we had was with binary files which had (wrongly) been checked into CVS as text files. On import, by default, cvs2svn does a couple of things to text files which can seriously damage binary files:
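First, it sets the svn:keywords property, so that keywords such as $Id$ are expanded, meaning that the binary file you check out may not be the one you checked in. Second, it sets the svn:eol-style property to native, meaning again that the binary file you check out may not be the one you checked in, since Subversion makes sure end-of-line sequences are the ones preferred by your client platform.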
We'd messed up but fortunately we'd messed up in an immediately obvious way: a number of binaries were broken, to the point that they wouldn't even execute.
This is one of those mistakes you only make once (until you make it the next time and kick yourself even harder, that is). I guess we were lulled into a false sense of security: everything seemed to be working so smoothly ... Subversion is better than CVS at handling binary files ... everything had been working fine with CVS, so our CVS repository must be fine ... cvs2svn would spot any problems.
Of course, our CVS repository wasn't fine. We'd got away with binary files marked as text for the simple reason that most of these files had been used on Linux only.
What makes this mistake so chastening is the fact that a basic acceptance test of the new repository would have been both simple and scriptable:
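#!/bin/sh
# CVSARCHIVE and SVNREPOS are placeholders for the CVS module and the repository URL
cvs co -d fromcvs CVSARCHIVE # Checkout from CVS, on the trunk
svn co SVNREPOS/trunk fromsvn # Checkout from SVN, on the trunk
# Spot the difference, ignoring each tool's own metadata directories
diff -q -r -x CVS -x .svn fromcvs fromsvn > all_diffs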
If the all_diffs file is empty, the CVS and Subversion checkouts are byte-for-byte identical.
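To make the check usable from a larger migration script, the comparison can be wrapped so that it reports pass or fail through its exit status. This is only a sketch: the function name checkouts_match is made up for illustration, and the -x option used to skip the CVS and .svn metadata directories is a GNU/BSD diff extension rather than something every diff provides.

```shell
#!/bin/sh
# Hypothetical wrapper around the acceptance test.
# Returns 0 if the two trees match, 1 if they differ, and diff's own
# error status (2 or more) if the trees could not be compared.
checkouts_match() {
    # -x skips each tool's metadata directories (GNU/BSD diff extension)
    diff -q -r -x CVS -x .svn "$1" "$2" > all_diffs
    status=$?
    case $status in
        0) echo "OK: checkouts match" ;;
        1) echo "FAIL: see all_diffs" >&2 ;;
        *) echo "ERROR: diff could not compare the trees" >&2 ;;
    esac
    return $status
}
```

With the checkouts in place, something like checkouts_match fromcvs fromsvn || exit 1 would stop a migration script as soon as the trees diverge.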
Unfortunately the all_diffs file wasn't empty. Remember those keyword expansions? Subversion is clever enough to replace CVS version numbers with its own revision numbers and as a result the files differ when checked out. Keyword expansion really is a bad idea!
Similarly, a number of text files were different because Subversion had tidied up inconsistent line endings.
So, there were plenty of false hits as well as a list of files we needed to run cvs admin -kb on.
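One way to separate the false hits from the files that really need fixing is to list only those mismatches that look binary. The sketch below is an illustration rather than the script we used: the function names are invented, "contains a NUL byte" is only a heuristic for "binary", and file names are assumed not to contain newlines.

```shell
#!/bin/sh
# Hypothetical helper: true if the file contains a NUL byte, a common
# heuristic for "this is binary" (a hint, not a guarantee).
has_nul() {
    # Stripping NULs changes the content only if there were any
    ! tr -d '\000' < "$1" | cmp -s - "$1"
}

# Print paths, relative to the CVS checkout, of files that differ
# between the two checkouts and look binary.
binary_mismatches() {
    cvsdir=$1
    svndir=$2
    # Walk the CVS checkout, skipping CVS metadata directories
    (cd "$cvsdir" && find . -name CVS -prune -o -type f -print) |
    while IFS= read -r f; do
        if [ -f "$svndir/$f" ] &&
           ! cmp -s "$cvsdir/$f" "$svndir/$f" &&
           has_nul "$cvsdir/$f"; then
            printf '%s\n' "${f#./}"
        fi
    done
}
```

Running binary_mismatches fromcvs fromsvn would then print a candidate list for cvs admin -kb, which is still worth reviewing by hand before changing anything in the CVS repository.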
Incidentally, we could have chosen to clean up the files during import by passing some more parameters to cvs2svn: a suitable combination of the --mime-types=FILE, --eol-from-mime-type and --no-default-eol options would have done the job. We decided, though, that the proper solution was to fix the root cause of the problem.
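For the record, an import along those lines might have looked roughly like the sketch below. We did not run it, so treat it as an assumption-laden illustration: the function name and its three arguments are placeholders, the mime.types file is assumed to be in the usual Apache format (a MIME type followed by file extensions), and option spellings can differ between cvs2svn releases, so check cvs2svn --help first.

```shell
#!/bin/sh
# Sketch only: the function name and arguments are hypothetical.
# A mime.types file maps extensions to types, with lines such as
#   application/octet-stream  bin exe dll
import_with_mime_types() {
    mime_types=$1   # e.g. /etc/mime.types, or a project-specific file
    svnrepos=$2     # path of the Subversion repository to create
    cvsarchive=$3   # path of the CVS repository or module to convert
    cvs2svn --mime-types="$mime_types" \
            --eol-from-mime-type \
            --no-default-eol \
            -s "$svnrepos" "$cvsarchive"
}
```

The idea is that line-ending handling is then driven by each file's MIME type rather than being applied to everything CVS believes is text.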
So, we had to delay by a day to reinstate CVS, run the text-to-binary corrections, re-run the migration and perform the acceptance tests. This time we were more cautious: we also tested builds made from the clean Subversion checkout.
Copyright © 2006 Thomas Guest