Subversion to GIT Migration: A Tale of Two Gotchas
<p>I&#8217;ve been wanting to migrate the GPSD codebase off Subversion to a distributed version control system for many months now. GPSD has a particular reason for DVCS; our developers often have to test GPSD sensors outdoors and aren&#8217;t necessarily in range of WiFi when they do it.</p>
<p>GPSD also needs to change hosting sites, for reliability reasons I&#8217;ve written about before. Though I&#8217;m a fan of Mercurial, I determined that moving to git would give us a wider range of hosting options. Also, git and hg are similar enough to make intermigration really easy &#8211; getting from SVN to either one is 90% of the way to the other.</p>
<p>This blog entry records two problems I ran into, and solutions for them. One is that the standard way of converting repos does unfortunate things with tags directories. The second is that the CIA hook scripts for git are stale and rather broken.</p>
<p><span id="more-1806"></span></p>
<p>GPSD uses tags directories strictly for archival purposes. When we cut a public release XX, we make a tag with the name &#8220;release-${XX}&#8221; and never modify that tree copy afterwards. We don&#8217;t use tags as branches. So, when I migrated, I wanted the release tags to be mapped into git tag objects rather than branches.</p>
<p>Unfortunately, the git-svn extension can&#8217;t do this; it will turn your tags into git branches. I&#8217;m told svn2git has the same behavior. Here&#8217;s what I ended up doing:</p>
<p><b>git svn clone --stdlayout --no-metadata file://${PWD}/stage2-repo</b></p>
<p>The --stdlayout option tells git-svn that the project has a stock SVN layout with trunk, tags and branches. It will tell the fetch operation to turn both tags and branches into git branches, then strip those three prefixes out of the repo paths. Then I did this:</p>
<p><b>git svn --ignore-paths="tags" fetch</b></p>
<p>This prevented the tags directories from being turned into branches. But it meant I had to make the git tag symbols by hand. I wrote a script to extract the revision levels that looked like this:</p>
<pre><tt><b>
#!/bin/sh
#
# Get a table of tag releases and dates from a checkout directory
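dir=$1/tags
for x in "$dir"/*
do
    base=`basename "$x"`
    # Keep only the "Last Changed" lines; the unquoted echo below joins them onto one line
    info=`svn info "$x" | grep Last`
    rev=`echo $info | sed -n '/.*Last Changed Rev: \([0-9]*\).*/s//\1/p'`
    date=`echo $info | sed -n '/.*Last Changed Date: \(...................\).*/s//\1/p'`
    # printf expands \t in every shell; echo does not
    printf '%s\t%s\t%s\n' "$base" "$rev" "$date"
done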
</b></tt></pre>
<p>Running it gave me a table that looked like this:</p>
<pre><tt><b>
release-2.21 1566 2005-04-12 20:10:40
release-2.22 1592 2005-04-25 17:01:53
release-2.23 1637 2005-05-04 14:07:39
release-2.24 1688 2005-05-17 12:48:47
release-2.25 1737 2005-05-21 00:19:51
</b></tt></pre>
<p>The columns are tag name, Subversion revision level, and date-time stamp. I then went through the SVN and git versions of the logs and added git IDs as a fourth field. That gave me a file that looked (in part) like this:</p>
<pre><tt><b>
release-2.21 1566 2005-04-12 20:10:40 1dd11f752275842a220ce5b2b93da2e2fa31a53c
release-2.22 1592 2005-04-25 17:01:53 d11c967125b8432e7d906fba18d67b3b2e7feaad
release-2.23 1637 2005-05-04 14:07:39 70e3d9e0ed7e2676554735ccfce8a4dd46b8bd9c
release-2.24 1688 2005-05-17 12:48:47 4fcd4e7bebfbf587c2889657ba94b79f6ace2859e
release-2.25 1737 2005-05-21 00:19:51 e2816c19964d124381a90d9338530a17ce47d43
</b></tt></pre>
<p>Note: I&#8217;d have written a tool to generate the entire thing, but I estimated that for only about 45 tags it would take less time to hand-hack the list. Then I applied the following script:</p>
<pre><tt><b>
#!/usr/bin/env python
#
# Apply a table of tag releases and dates from a checkout directory
#
import sys, os
for line in open(sys.argv[1]):
    # Fields: tag name, SVN revision, date, time, git commit ID
    (release, rev, date, time, commit) = line.split()
    os.foo('GIT_COMMITTER_DATE="%s %s" git tag -m "Tag for public release." %s %s' % (date, time, release, commit))
</b></tt></pre>
<p>The &#8220;foo&#8221; in there should actually be the word &#8220;system&#8221;, but if written that way WordPress thinks it&#8217;s an attempt at malicious code injection and barfs.</p>
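<p>For comparison, here is a minimal sketch of the same step using the subprocess module instead of a shell command string. It assumes each table line has five whitespace-separated fields (tag name, SVN revision, date, time, git commit ID), as in the file shown above. The function names <tt>tag_command</tt> and <tt>apply_tags</tt> are made up for this illustration and are not part of the original script.</p>

```python
#!/usr/bin/env python3
# Sketch only: apply a table of release tags without building a shell string.
# Assumed line format: <tag name> <svn rev> <date> <time> <git commit ID>
import os
import subprocess


def tag_command(line):
    """Return (committer_date, argv) for one line of the table."""
    release, _rev, date, time, commit = line.split()
    argv = ["git", "tag", "-m", "Tag for public release.", release, commit]
    return "%s %s" % (date, time), argv


def apply_tags(path):
    """Create one annotated tag per non-blank line of the table at path."""
    with open(path) as table:
        for line in table:
            if not line.strip():
                continue  # skip blank lines
            committer_date, argv = tag_command(line)
            # Backdate the tag by overriding the committer date for this call only
            env = dict(os.environ, GIT_COMMITTER_DATE=committer_date)
            subprocess.check_call(argv, env=env)
```

<p>Run inside the converted repository, a call such as <tt>apply_tags("tags.txt")</tt> (the file name is hypothetical) would create the tags. Passing the date through the environment rather than the command string sidesteps shell quoting.</p>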
<p>I think this was more work than I should have had to do. When stdlayout is enabled, the conversion tools should know that SVN tags have different semantics than SVN branches and automatically lift to tag symbols if the tag tree has not been modified.</p>
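<p>One way a conversion tool might decide that a tag tree has not been modified is to look at the output of <tt>svn log -q --stop-on-copy</tt> for the tag directory: if only one revision shows up, that revision should be the copy that created the tag. The sketch below is a heuristic under that assumption, not something git-svn does. The helper names and the sample log text are invented for the example.</p>

```python
import re

# Header line of one entry in `svn log -q` output, for example:
#   r1566 | someone | 2005-04-12 20:10:40 -0400 (Tue, 12 Apr 2005)
REVISION_LINE = re.compile(r"^r(\d+) \|")


def revisions_in_log(log_text):
    """Extract revision numbers from `svn log -q` output, in listed order."""
    matches = (REVISION_LINE.match(line) for line in log_text.splitlines())
    return [int(m.group(1)) for m in matches if m]


def is_pure_tag(log_text):
    """True when the log holds exactly one revision (assumed to be the copy).

    log_text is expected to come from: svn log -q --stop-on-copy <tag dir>
    """
    return len(revisions_in_log(log_text)) == 1


# Hypothetical sample: a tag directory that was copied once and never touched
SAMPLE = (
    "-" * 72 + "\n"
    "r1566 | someone | 2005-04-12 20:10:40 -0400 (Tue, 12 Apr 2005)\n"
    + "-" * 72 + "\n"
)
print(is_pure_tag(SAMPLE))
```

<p>A tag directory that received later commits would list more than one revision, and would then be left as a branch.</p>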
<p>The second problem I ran into is that the hook scripts CIA.vc supplies for git are badly out of date. Modern git installations don&#8217;t put all the helper commands in the normal $PATH; I had to add these lines to the shell hook script to make it work.</p>
<p> export PATH<br />
PATH="$PATH:`git --exec-path`"</p>
<p>The Perl script has a similar problem. Investigating further, I found that the CIA local copies of these scripts are very stale; they need to be refreshed from the upstream maintainers.</p>
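<p>The same PATH fix can be sketched for a hook written in Python. This is an illustration only: the CIA hooks discussed here are shell and Perl, and the function names below are hypothetical. The one external command used is <tt>git --exec-path</tt>, the same one as in the shell lines above.</p>

```python
import os
import subprocess


def extend_path(path, exec_path):
    """Append exec_path to a PATH-style string unless it is already listed."""
    parts = path.split(os.pathsep) if path else []
    if exec_path not in parts:
        parts.append(exec_path)
    return os.pathsep.join(parts)


def git_exec_path():
    """Ask git where its helper programs are installed."""
    return subprocess.check_output(["git", "--exec-path"]).decode().strip()


def fix_hook_environment():
    """Make git helper commands reachable for anything the hook runs later."""
    current = os.environ.get("PATH", "")
    os.environ["PATH"] = extend_path(current, git_exec_path())
```

<p>Calling <tt>fix_hook_environment()</tt> once at the top of the hook would have the same effect as the two shell lines.</p>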
<p>These were just speedbumps; the git repo is working fine now, and I&#8217;ve shut down SVN. I hope this note will be good googlebait for anyone who trips over the same problems.</p>
<p>UPDATE: The author of the gitorious project svn2git alleges that his tool can do this. But you have to write a rules file, and he admits the tool is &#8220;not well documented&#8221;. Do not confuse this with the tool of the same name on github, which is but a thin wrapper around git-svn&#8230;</p>
<p>UPDATE2: I fixed the CIA git scripts. They live in the official git repo now.</p>
<p>UPDATE3: <a href="http://esr.ibiblio.org/?p=1806#comment-251908">This comment</a> is correct. Because I understood git internals poorly at the time, I missed a simpler way to do this job. Allow git-svn to do the conversion of tags into branches and then just move the tag files! </p>
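<p>A rough sketch of that simpler approach follows. Instead of moving files under .git by hand, it uses <tt>git update-ref</tt> to create each tag and delete the branch git-svn made, which should also cope with packed refs. It produces lightweight tags, not the annotated ones made earlier in this post. The ref prefix is an assumption: git-svn has used <tt>refs/remotes/tags/</tt>, and newer versions may use <tt>refs/remotes/origin/tags/</tt>, so check your own repository first. The function names are invented for the example.</p>

```python
import subprocess

# Assumption: where git-svn put the SVN tags. Adjust after inspecting
# the output of `git for-each-ref refs/remotes` in your repository.
TAG_REF_PREFIX = "refs/remotes/tags/"


def plan_tag_moves(for_each_ref_output, prefix=TAG_REF_PREFIX):
    """Turn '<sha> <refname>' lines into git commands that retag and clean up."""
    commands = []
    for line in for_each_ref_output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split()
        if not ref.startswith(prefix):
            continue  # trunk and real branches are left alone
        name = ref[len(prefix):]
        # Create a lightweight tag pointing at the same commit
        commands.append(["git", "update-ref", "refs/tags/" + name, sha])
        # Remove the branch-style ref that git-svn created for the tag
        commands.append(["git", "update-ref", "-d", ref])
    return commands


def move_tags():
    """List remote refs and run the planned commands in the current repo."""
    listing = subprocess.check_output(
        ["git", "for-each-ref", "--format=%(objectname) %(refname)", "refs/remotes"]
    ).decode()
    for argv in plan_tag_moves(listing):
        subprocess.check_call(argv)
```

<p>Printing the result of <tt>plan_tag_moves</tt> before calling <tt>move_tags()</tt> gives a dry run of what would change.</p>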