On Learning Haskell
I’ve had learning the computer language Haskell on my to-do list for some time. I’m actually stepping up to learn it now, thanks to a temporary lull in my other activities and a vicious cold that has left me disinclined to strenuous work. I may associate Haskell with the taste of zinc gluconate for the rest of my days; both have an astringent and medicinal quality and produce fervent claims of effectiveness about which I am moderately optimistic but which I have not yet personally verified.
Haskell is very obviously a language built by mathematical logicians for mathematical logicians. Category theory lurks behind it in the same way that the lambda calculus lurks behind LISP. The following is an effort to make Haskell’s way of carving up the world intelligible, written partly to clarify my own thoughts.
Haskell is built around a handful of powerful primitive concepts and a pons asinorum. The most pervasive of these, purely functional programming without side effects or variables, is generally described in introductions to Haskell as the most difficult barrier for most programmers arriving from conventional imperative or OO languages like C, C++, and Python. And yes, if your mental model of programming is all about for-loops and static variables — or, for that matter, objects! — learning to think in Haskell is going to be quite a wrench.
There are reasons to make the effort. Haskellers (Haskellites? Haskellians? Haskellators? Haskelletons?) maintain that imperative programming relies heavily on side effects that push proving the correctness of programs to somewhere between impractically difficult and impossible. They also like to point out that side effects make programs difficult to automatically parallelize across multiple processors, an increasingly important consideration as multicores become the rule rather than the exception.
Both are solid arguments. It’s less a wrench for me to give up imperative thinking than for most because I’m a Knight of the Lambda Calculus from way back — and, while LISP is not a purely functional language, anyone who programs seriously in it will develop a feel for the functional style. Accordingly, while hanging out on the #haskell IRC channel I recommended to someone newbier than me that he might consider learning some LISP first and then coming back to Haskell. None of the hard-core Haskellians on the channel demurred, and I think this is probably good advice in general.
Now I will admit that the preceding paragraphs contained two fibs. First: Haskell does have variables, sort-of kind-of. But such a “variable” can only assigned once and the value of the variable is a promise to re-evaluate the expression on the right side of the assignment whenever the variable is evaluated; it behaves more like a safe macro or an Algol-60 call-by-name than like what people used to modern imperative languages call variables. Second: it is possible to define operations that have side effects, and things that have the behavior of real variables, using a construct called a monad.
I’ll get back to monads, but before I do I should introduce two other fundamentals of Haskell: static typing and lazy evaluation. Static typing shouldn’t be a strange concept to anybody who’s ever written in a compiled language like C or Java, but Haskell pushes the concept to some sort of logical limit. Variables need not have explicit types (they’re implicitly typed by the expression they’re shorthand for), but there’s a syntax that allows you to attach type signatures to any function, and the compiler does type inference from those. This has two consequences: it makes efficient compilation of the language possible (which is unusual for a language at Haskell’s level of abstraction), and (more importantly in the Haskell view of things) the type annotations assert invariants that can be used to prove the correctness of the program.
User-defined types are (more or less) the values of type-valued expressions (it’s actually more complicated than that, but thinking about it this way is a useful point of entry). The single most delightfully weird detail of Haskell I’ve run into so far is this: you can have type-valued variables, and write type-valued expressions that are recursive! For some types, such as trees, this is the natural way to do things.
Lazy evaluation is easier to understand. All it means is that the value of an expression is not computed until it’s actually needed by the caller; evaluation runs outside-in rather than inside-out. If you’re familiar with the idea of a closure from Scheme (or another language, such as Ruby, that has borrowed Scheme closures) it helps to think of Haskell expressions as returning closures. When the program runs, the outermost (main) closure is forced (evaluated); this may trigger the forcing of other closures (expressions and function calls) directly inside it, and so on through their sub-expressions which are also closures. Local color so you can sound like you know what you’re talking about: in Haskell-land, a closure is called a “thunk”.
Various optimizations make lazy evaluation efficient; notably, because expressions are (usually) pure, the closure can often be replaced by a pointer to its value and never need to be evaluated more than once. A benefit of lazy evaluation is that you can write code like an Icon or Python generator that spins forever, returning a value on each cycle, and it will only be called for the exact number of returns that the enclosing program actually needs even if the caller is itself a generator. This is one of the capabilities that replaces for-loops in imperative languages.
Even if most of your Haskell code is pure (no state, no side effects) it’s going to need to interface with a stateful world. I/O, in particular, is not pure; getting a line from your input source gives a result which will vary depending on the state of the world outside the program. Normal Haskell practice is to write your programs with as much pure code as possible and a minimum number of calls to functions with side effects (such as getting a line of input from standard input or putting a line of input to standard output). Only, because evaluation is outside in, what I/O function calls actually do is create closures that will have I/O side effects when they’re forced.
So, the question: let’s say we have multiple calls to functions generating closures with with output side effects. How do we write the code so the closures get forced in the order we want? There are multiple ways this could be done; the most obvious would require some kind of special construction that is an exception to “everything is function calls”. What Haskell uses for such sequencing is a monad.
Ah, monads. These are Haskell’s pons asinorum. Haskellers are in love with the fact that they actually behave like a recondite concept called “strong monads” from category theory; at this point in many expositions of the language the author would start waving this fact around. I’ve been a mathematician myself, I retain some grasp of category theory, and I say invoking it here is confusing and unnecessary.
A simpler way to think about monads is as a hack to turn function composition into a way of forcing the sequencing of function calls, or the functional-programming equivalent of a shell pipeline. And having said that provocative thing, I’m not going to go into the gory technical details required to make that actually happen.
I normally consider the syntax of a language, no matter how odd, to be a mere detail compared to more important things like its type ontology and how it does evaluation. But one central thing about Haskell’s syntax deserves attention: the way it uses productions. A Haskell function can be written in a sort of AWK-ish style as a series of pattern-action pairs; they’re tried, in sequence, and the first pattern to match the input fires. Falling off the end of the list yields a runtime error, but a wildcarded “everything else” production is easy to write.
Summing up: I don’t know what I’m going to use Haskell for, or indeed if I’ll ever use it at all. But the time I’ve spent wrestling with it has not been wasted; it has challenged my preconceptions, shown me some possibilities I hadn’t seen before, forced me to develop a practical grasp of some concepts like lazy evaluation that were previously only theory to me, and in general shook up my thinking. That’s valuable in itself. In How To Become A Hacker, I wrote “LISP is worth learning for [..] the profound enlightenment experience you will have when you finally get it. That experience will make you a better programmer for the rest of your days, even if you never actually use LISP itself a lot.” I think the same can be said of Haskell, and for very similar reasons.