Reposturgeon and Santa Claus Against The Martians!
Here’s a late New Year’s gift for all you repository-editing fiends out there: the long-awaited and perhaps long-dreaded reposurgeon 3.0.
In “Heads up: the reposturgeon is mutating!” I described the downside of a strategy of incremental small language changes aimed at preserving compatibility: you can wind up trapped by suboptimal early decisions. Sometimes you have to bust out and do the big redesign, which I did, and which is why there’s a bump in the major version number (the last time that happened was when reposurgeon got the ability to read Subversion dump files directly).
The biggest change is that the command language syntax has mutated from VSO to SVO. What? You’re not up on your comparative linguistic morphology and have no idea what I’m talking about? That’s Verb-Subject-Object to Subject-Verb-Object.
Before 3.0 the order of syntactic elements in a command was: action verb first, then (for most commands) an event selection set, then (for some commands) an object like a directory or repository name. Now the selection set always comes first, followed by the action verb, followed by any object-like arguments.
This change makes the syntax more regular and easier to describe. It is easier mainly because the old confusion is gone: when a selection set followed the command verb, it was unclear what counted as the first argument of the command. The selection set, or what came after it? (Correct answer: what came after.)
In making this change I am moving closer to a Unix design archetype that had already influenced reposurgeon pretty heavily: ed(1). ed had a horrendously awful UI by modern standards, but it was (and still is) great for scripting. If you think of ed as a record editor for which the records are text lines, and study its selection syntax, the influence – and the reasons ed makes a useful model for what reposurgeon is doing – should be obvious.
A significant new feature is that reposurgeon now has a user-definable macro facility. I have written in the past that macros are generally a bad idea and I still think that’s true in general. (One representative major problem with them is that when macro expressions cross certain kinds of syntactic boundaries in the base language they often become a serious impediment to readability and maintainability.)
But I found I wanted macros while converting the groff repository, and reposurgeon’s base language is simple in some ways that make the obscuring effect of macros less dangerous. There are no analogs of the “++” postfix operator which in C makes “#define square(x) (x)*(x)” such a wonderful way to generate unanticipated side effects. (Hint: consider what happens when you say “square(a++)”. How many times will a be incremented, again?)
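For anyone who wants to see that trap in action, here is a minimal sketch. Strictly speaking, “square(a++)” modifies a twice with no sequencing between the two increments, which is undefined behavior in C, so the sketch swaps in a function call with a counted side effect to make the double evaluation observable without it. The names next_value, square_fn, evaluations_with_macro and evaluations_with_function are invented for this illustration.

```c
/* The macro from the text: its argument is pasted into the expansion twice. */
#define square(x) (x)*(x)

/* Hypothetical helper for this sketch: counts how many times it is called. */
static int calls = 0;

static int next_value(void)
{
    calls++;
    return 3;
}

/* A real function, by contrast, evaluates its argument exactly once. */
static int square_fn(int x)
{
    return x * x;
}

/* Returns how many times next_value() runs when passed through the macro. */
int evaluations_with_macro(void)
{
    calls = 0;
    /* Expands to (next_value())*(next_value()). */
    int result = square(next_value());
    (void)result;
    return calls;
}

/* Returns how many times next_value() runs when passed to the function. */
int evaluations_with_function(void)
{
    calls = 0;
    (void)square_fn(next_value());
    return calls;
}
```

The macro version reports two calls and the function version reports one. A macro language with no side-effecting operators gives this kind of surprise far less room to happen.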
Many small irritations in the language have been fixed. “delete” now really means delete and is no longer overloaded with several variants of a commit-squashing operation; that is now “squash”. (Yes, this adopts some git terminology.)
Pathset syntax is now simpler and more powerful. For starters, pathsets now match not only commits touching matching paths but the content blobs that the paths point at (you can select either subset by qualifying with the =C or =B selectors). This is particularly useful in connection with the ‘filter’ command, which allows you to modify comments and blobs by passing them through a user-specified filter.
There are lots of other changes as well. If you have worked with reposurgeon before you’ll have a bit of relearning to do. Sorry about that, but experience has taught me that (when you can get away with it at all) one big, obvious compatibility break is kinder than a long-drawn-out series of little ones that leave everybody wondering what the feature set of the week is.