Why on Earth Should I Functional Program?
This blog post puts forward what I think is a solid case for adopting the principles of functional programming. It doesn't cover the actual toolset that functional programmers gain; if anything it covers the toolset we ought to lose. The overall point is that code legibility comes from constraining ourselves as we write.
Higher Level Programming
goto was (thankfully, it's mostly past tense now) a control flow operation that redirected execution to a labeled point in a codebase (or even a line number!). For example, this program prints incrementing numbers forever.
1: count = 0
2: print(count)
3: count++
4: goto 2
We don't see goto very often anymore, and we take structured programming as a default. Our programs run steps in order, using if statements for conditional branching, loops for iterating, and return to give us a value back. Dijkstra (of shortest path algorithm fame) wrote "Go To Statement Considered Harmful", a format aped by many on Hacker News until the logical capstone '"Considered Harmful" Essays Considered Harmful'. His justification for superseding goto with specialised control flow statements is as follows:
My second remark is that our intellectual powers are rather geared to master static relations and that our powers to visualize processes evolving in time are relatively poorly developed.
For that reason we should do (as wise programmers aware of our limitations) our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.
I think this logic is solid. We should make the program — and the process of what it does — transparent to the reader. We too often ignore that code is a form of communication between people, and that unreadable code can kill projects and badly delay development. When giving up goto, we gave up flexibility; no more arbitrary redirects, just if statements and loops with limited expressive power. In exchange, we gained legibility. Now we can see what a program does with less mental "spatial / temporal" untangling.
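To make the trade concrete, here is the goto program from above rewritten with structured control flow (a TypeScript sketch, since the later examples use it; like the original, it loops forever):

let count = 0;
while (true) {        // the loop structure makes the repetition explicit
  console.log(count);
  count++;            // nothing can jump into this block from elsewhere
}

The reader no longer has to trace labels across the program; the shape of the block is the shape of the control flow.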
We should perhaps continue with this principle beyond structured programming. Below I propose some principles we should follow until they'd have us write something barbarous. Let's not forget that codebases are ultimately things people need to read, understand, and ideally not despise. Code is communication, both with the machine and with your fellow developers. Let's write well.
'By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and, in effect, increases the mental power of the race.' -- Alfred N. Whitehead
Let's use the least powerful tool for the job.
We should use scalpels when sledgehammers aren't needed. Our programs can be implemented in many ways, but I believe we should follow this ordering of tool choices:
- types
- data
- functions
- statements
- classes
- metaprogramming
This listing prioritises simple, limited forms of computation over more advanced ones like metaprogramming (which can perform arbitrary program transforms and breaks all guarantees). Programs that are mostly static data plus a few loops are fairly legible. N interlocked class instances, each with local state, are difficult to untangle in comparison, and just difficult generally.
Why functions over statements? When a function exists that prepackages functionality (like map for applying elementwise transformations) we should use it rather than reimplement it ad infinitum. This is often the most emphasised part of why Functional Programming is Good, but it misses the broader point. We should always avoid reinvention, and we should always use the least powerful tool for the job. We stopped using gotos, and we can move on from structured programming too and operate at higher levels of abstraction. Writing at a higher level of abstraction from machine code more clearly communicates your intent. By the same principle: avoid folds (which generalise recursion across a container with an accumulator) and instead prefer more specialised functional operations like filter.
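As a sketch of that last point, compare a hand-rolled filter built from a fold with the specialised operation (plain TypeScript, over an illustrative ratings array):

const ratings = [1, 5, 3, 5];

// a fold can express filtering, but the reader must replay the accumulator to see it
const fivesViaFold = ratings.reduce<number[]>(
  (acc, r) => (r === 5 ? [...acc, r] : acc),
  []
);

// the specialised operation states the intent directly
const fives = ratings.filter((r) => r === 5);

Both produce [5, 5], but only one of them says "filter" at a glance.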
Let's look at some examples. What guarantees do we lose as we advance from types to metaprogramming?
The basics; Types
Type-only programs aren't terribly useful in practice (for most people). Types communicate information before runtime to our language's type-checker. The main guarantee type-based designs give (with caveats) is that all the information needed is soundness-checked before your program even starts.
type Rating = '⭐' | '⭐⭐' | '⭐⭐⭐' | '⭐⭐⭐⭐' | '⭐⭐⭐⭐⭐'

type BandRating = {
  band: string,
  rating: Rating
}
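To see that guarantee in action, a small sketch against the types above: the ill-formed value never survives to runtime.

const ok: Rating = '⭐⭐⭐';  // fine: a member of the union

// This never compiles; the mistake is caught before the program starts:
// const sixStars: Rating = '⭐⭐⭐⭐⭐⭐';  // Error: not assignable to type 'Rating'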
What we care about; Data
When I refer to data here, I mean static declarations of primitives / ADTs without additional class baggage like initialisers, validators, etc. Data declarations can potentially be checked by JSON Schema in addition to normal type-checking, but data adds a complexity: it exists at runtime, so we need to trace its scope, mutation, access, and GC tidy-up. Substructural type systems also help move some of these checks back from runtime into the type-checker, but they're far from common.
Data generally is legible. Read the data, and you know what it is, forever (depending on how you code, keep reading...).
const example: BandRating = {
  band: 'Your Favourite Band',
  rating: '⭐⭐⭐⭐⭐'
}
What we unfortunately need; Code
The problems really begin here. We now introduce data transformations, so we need to mentally track what data exists at a point in time, what state it's in, how it reached that state, and what might happen to it later. We've added a temporal dimension too, as changes occur across time. And we've added impurity; useful programs use IO and thus become nondeterministic (there are no mathematical guarantees that a server on the other side of the planet will return the expected content).
Functions are preferable to statements on the grounds of choosing a single word instead of a sentence (i.e., preferring some pre-vended structure over rolling your own). In other respects they are the same from a complexity viewpoint.
There are still some constraints that hold when using functions (ignoring generators and other oddities). Typed functions have known domains and codomains. We can see what variable names a function accesses and what subfunctions it calls. Functions are stateless; they hold no data (though they do have scope through which to see it). We can see what they return, and they operate under the rules of structured programming (though exceptions are, well, an exception).
function groupByRating(bands: BandRating[]) {
  return Object.groupBy(bands, ({ rating }) => rating);
}
For some reason it was decided that bundling data with code was a good idea. So now we have classes, which give fewer guarantees. In particular, class instances have self-references on which they are encouraged to mutably set data. This means we need to check whether each method uses this mutable state or operates under normal function rules. Class-based designs build a web of interlinked instances, each with its own little state object, so we move from a predominantly linear reading of code to needing to hold it all in our minds simultaneously.
class BandRating {
  band: string
  rating: Rating

  constructor(band: string, rating: Rating) {
    this.band = band;
    this.rating = rating;
  }
}

class BandRater {
  bands: BandRating[]
  grouped: Partial<Record<Rating, BandRating[]>>

  constructor(bands: BandRating[]) {
    this.bands = bands;
    this.grouped = Object.groupBy(bands, ({ rating }) => rating);
  }
}
const ratings = [
  new BandRating('Your Favourite Band', '⭐⭐⭐⭐⭐')
];
const rater = new BandRater(ratings);
console.log(rater.grouped);
With classes, we also start to creep into...
... sheer Bedlam; metaprogramming
Initialisers, postinitialisers, callables, property decorators that make methods look like values. Direct scope access and modification. Arbitrary code rewrites at run time. Screwing around directly on the call stack. Redefining the definition of a bracket (R language). Redefining class inheritance at will. Adding new and exciting control-flow operators. The options are endless, and constraints vanish.
Once metaprogramming is involved, you can only hope the author made light and contained use of it. Otherwise the semantics of your language are completely rewritten, and you no longer speak it.
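One small taste, a JavaScript/TypeScript sketch using the standard Proxy object: property access itself can be redefined, so reading the call site tells you nothing.

const sneaky = new Proxy({ rating: '⭐' }, {
  get: () => '⭐⭐⭐⭐⭐',   // every property read now answers five stars
});

console.log(sneaky.rating);  // '⭐⭐⭐⭐⭐'; the declared data is irrelevant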
Things fall apart; the centre cannot hold; Mere anarchy is loosed upon the world -- W. B. Yeats, on metaprogramming
Restrict Ourselves Further
There are further restrictions we can place on ourselves to make programs legible:
- Use frozen data types and constant value declarations to express we won't modify state (sketched in code after this list)
- Avoid reference-based updates across function boundaries, to express state is never modified by a distant actor
- Provide the smallest possible inputs to a function, to express precisely what it depends on
- Define higher-order functions to factor out base functionality, to minimise novel code
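A minimal TypeScript sketch of restrictions 1, 3, and 4 (the names are invented for illustration):

// 1. frozen, constant data: the reader knows this value never changes
const TOP = Object.freeze({ rating: '⭐⭐⭐⭐⭐' } as const);

// 3. smallest possible input: take the rating string, not a whole band record
function isTopRated(rating: string): boolean {
  return rating === TOP.rating;
}

// 4. a higher-order function factors the "check every element" pattern out once
const all = <T>(xs: readonly T[], pred: (x: T) => boolean): boolean =>
  xs.every(pred);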
Let's have one name for one thing
That way, we always know what value a name is for. This principle is called referential transparency; it should be possible to replace all occurrences of a variable x with its value without altering program behaviour. Referentially transparent programs assure us that x now is no different from x ten lines down the program, which makes it much easier to read. Referential transparency has costs; it prevents us from appending to lists, setting keys in an object, or even incrementing integers in place. In practice functional programmers do these things a little more indirectly with function calls like:
def append[T](xs: list[T], x: T) -> list[T]:  # Python 3.12+ generics syntax
    return xs + [x]
(generally, functional programmers do the same things in roundabout ways). There are performance implications to avoiding in-place updates, but there is a variety of purely functional data structures that perform well enough. One major upside is that referentially transparent code is incapable of the spooky-action-at-a-distance effects that OOP can have, where some class four files over messes up your object in an innocuous-looking method. With one name, one value, it's clear how your value is "modified": simply look at what a function takes in and what it returns.
Most languages have pass-by-reference semantics (or something equivalent) that allow updates through a pointer or reference. We can still get most of the benefit of referential transparency by avoiding excess mutation by reference within a function, and prohibiting updates by reference across function or class boundaries. This preserves the locality of edits, making it easier to trace how state changes in a program.
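A TypeScript sketch of that compromise: mutation is allowed, but only on data the function privately owns.

// local mutation is fine: the caller's array is copied, never touched
function sorted(xs: readonly number[]): number[] {
  const copy = [...xs];        // private copy; edits stay inside this function
  copy.sort((a, b) => a - b);  // in-place sort, but only on our own data
  return copy;
}

const original = [3, 1, 2];
const result = sorted(original);  // original is still [3, 1, 2]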
Functions should only do one thing
We should split code into small functions and piece them together into programs. Each function is responsible for one logically atomic operation; it might make an HTTP request, it might process the request output, it might update a database column. It should not do all three things.
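A sketch of the split, with hypothetical names and a hypothetical endpoint:

// 1. IO only: make the HTTP request
async function fetchReport(url: string): Promise<string> {
  const res = await fetch(url);
  return res.text();
}

// 2. pure: process the output
function headline(report: string): string {
  return report.split('\n')[0];
}

// 3. glue at the edges; the logic never touches the IO
async function run(): Promise<string> {
  return headline(await fetchReport('https://example.com/report'));
}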
This has a few benefits:
Split to limit complexity
A small function can only go wrong in so many ways. This section is a bit more theoretical, and considers the set of functions with n variables in scope (either through lexical closure or directly). When a function has n variables in scope there are theoretically O(n^2) pairwise interactions.
(Concretely, n variables admit n(n-1)/2 distinct pairs, which grows quadratically with n.)
What does this mean in practice? We often use variables in combination with others, and for larger functions we have to consider not just the n variables in scope but also their interactions with all the other variables. The count of possible interactions scales non-linearly, so the same functionality chopped into smaller functions has fewer cross-interactions to check. This is one reason smaller functions are easier to understand; there are fewer things that could happen. Conversely, one megafunction or script has the maximal number of potential interactions, since everything is in scope. Avoid this, if only for the sanity of code-reviewers; we're already well aware that code legibility falls off sharply with length.
Split for testing
We can more easily test code that's divided neatly. Small functions have fewer inputs and smaller outputs, and less cyclomatic complexity to cover. In some cases we can even test all possible cases for smaller functions (i.e., ones that operate on inputs with finite ranges).
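For instance, a function over the finite Rating type from earlier can be tested exhaustively (a sketch; starCount is an invented example, and the type is repeated here for self-containment):

type Rating = '⭐' | '⭐⭐' | '⭐⭐⭐' | '⭐⭐⭐⭐' | '⭐⭐⭐⭐⭐';

function starCount(rating: Rating): number {
  return [...rating].length;  // ⭐ is a single code point, so spreading counts stars
}

// five possible inputs, five checks: the whole domain is covered
const cases: [Rating, number][] = [
  ['⭐', 1], ['⭐⭐', 2], ['⭐⭐⭐', 3], ['⭐⭐⭐⭐', 4], ['⭐⭐⭐⭐⭐', 5],
];
for (const [input, expected] of cases) {
  console.assert(starCount(input) === expected);
}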
Split for reuse
We can reuse small functions, but not megafunctions.
Split to isolate IO contamination
One important subcase is splitting side-effectful, impure code (e.g. file reads, network calls) from the processing of the retrieved data. This isolates where IO exceptions and indeterminacy can creep into a program, and lets us test the pure processing logic without touching the network or disk.
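A sketch: once the parsing is pulled out of the fetch, its tests never touch the network (parseRatings is an invented example).

// pure: no fetch, no file handles, nothing to mock
function parseRatings(json: string): number[] {
  return JSON.parse(json).map((entry: { rating: number }) => entry.rating);
}

console.assert(
  JSON.stringify(parseRatings('[{"rating": 5}]')) === JSON.stringify([5])
);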
In Conclusion
Functional programming is not good because it lets us write fewer for-loops. It is good because FP is a language of constraints, and constraints, patterns, regularity, and predictability in code allow us to break complex problems into a form that we can hope to understand. Writing dumb, constrained, simple code in the functional style lets us obey Kernighan's Law —
Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?