Stringly-Typed Code

As part of my house-finding project, Castle, I wrote the following substantial Cypher query to find the distance from a property to work by train:

await this.db.run(`
match (p:Property)-[:NEAR]-(d0:DartStop), (w:Workplace)-[:NEAR]-(d1:DartStop)
  where p.id = $id
with *
match path=(d0)-[:TRAVELS_TO*]-(d1)
with *

with reduce(acc={ sum: 0, last: null }, node in nodes(path) | case
  when acc.last is null
    then {
      sum: 0,
      first: node
      last: node
    }
  else
    {
      sum: acc.sum + distance(acc.last.location, node.location),
      first: acc.first,
      last: node
    }
  end
) as acc, p, d0, d1, w

return {
  totalDistance: acc.sum + distance(p.location, d0.location) + distance(d1.location, w.location),
  dartDistance: acc.sum,
  lastName: acc.last.name,
  firstName: acc.first.name,
  walk: distance(p.location, d0.location) + distance(d1.location, w.location)
} as result
`, { id: property.id })

That’s not exactly true; I added a comma after first: node to fix the syntax. Did you spot it? I don’t like string-based query languages because they’re hard to read (lack of highlighting), there are no compile-time syntax checks, and dynamic string-query construction often leads to SQL injection.

[Image: xkcd, "Exploits of a Mom". A joke about a son whose name would drop SQL tables in poorly sanitised databases.]
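To make the injection risk concrete, here is a small sketch that only builds strings and never touches a database. The malicious id value and the simplified query are my own illustrations, not part of Castle.

```javascript
// Hypothetical attacker-controlled input; a real id would be a plain number.
const id = "0 or true";

// String construction: the input becomes part of the query's syntax,
// so the where clause now matches every property.
const unsafe = `match (p:Property) where p.id = ${id} return p`;

// Parameterised: the query text is fixed and the value travels separately,
// so the database only ever treats it as data.
const safe = {
  text: "match (p:Property) where p.id = $id return p",
  params: { id },
};

console.log(unsafe);
console.log(safe.text, safe.params);
```

The query at the top of this post already passes { id: property.id } as a parameter, which is the safe form. The trouble starts when parts of the query itself have to be assembled dynamically.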

I think we can do better. First, come up with a machine-readable representation of the DSL:

[
  [
    "match",
    [
      "pattern",
      [
        "cartesian",
        [
          "relation",
          "NEAR",
          ["node", "p", "Property"],
          ["node", "d0", "DartStop"]
        ],
        [
          "relation",
          "NEAR",
          ["node", "w", "Workplace"],
          ["node", "d1", "DartStop"]
        ]
      ]
    ]
  ],
  [
    "where", ["eq", "p.id", "$id"]
  ],
  [
    "with", "*"
  ],
  // - much more
]
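Because the representation is plain data, turning it back into Cypher text is a small recursive function. The following is a minimal sketch that handles only the clause kinds shown above. The names render and renderQuery are hypothetical, and a real version would also need to validate identifiers such as labels and property names.

```javascript
// Hypothetical renderer: turns the nested-array form into Cypher text.
// Only the kinds that appear in the example above are handled.
function render(expr) {
  if (typeof expr === "string") return expr; // identifiers, "$id", "*"
  const [kind, ...args] = expr;
  switch (kind) {
    case "match":
      return `match ${render(args[0])}`;
    case "pattern":
      return render(args[0]);
    case "cartesian":
      return args.map(render).join(", ");
    case "relation": {
      const [type, left, right] = args;
      return `${render(left)}-[:${type}]-${render(right)}`;
    }
    case "node": {
      const [name, label] = args;
      return `(${name}:${label})`;
    }
    case "where":
      return `where ${render(args[0])}`;
    case "eq":
      return `${render(args[0])} = ${render(args[1])}`;
    case "with":
      return `with ${render(args[0])}`;
    default:
      throw new Error(`unknown clause kind: ${kind}`);
  }
}

// A query is a list of clauses, rendered one per line.
const renderQuery = (clauses) => clauses.map(render).join("\n");

const query = [
  ["match", ["pattern", ["cartesian",
    ["relation", "NEAR", ["node", "p", "Property"], ["node", "d0", "DartStop"]],
    ["relation", "NEAR", ["node", "w", "Workplace"], ["node", "d1", "DartStop"]],
  ]]],
  ["where", ["eq", "p.id", "$id"]],
  ["with", "*"],
];

console.log(renderQuery(query));
```

An unknown clause kind throws straight away, so a typo in the structure fails in our own code rather than as a syntax error from the database.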

Second, write a library that outputs this representation:

db.match([
  node('p:Property').rel('near').node('d0:DartStop'),
  node('w:Workplace').rel('near').node('d1:DartStop')
])
.where('p.id', 'eq', id)
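One way such a library could work is sketched below. Every name here (node, rel, match, where, build) is hypothetical, as are the choices to upper-case relationship types and to generate parameter names like $p0. It supports only single-hop patterns, which is enough for the calls above.

```javascript
// Hypothetical builder that emits the nested-array representation.
function node(spec) {
  const [name, label] = spec.split(":");
  const self = ["node", name, label];
  return {
    toAst: () => self,
    // rel(type).node(spec) links this node to another one.
    rel: (type) => ({
      node: (otherSpec) => {
        const other = node(otherSpec).toAst();
        const ast = ["relation", type.toUpperCase(), self, other];
        return { toAst: () => ast };
      },
    }),
  };
}

function match(patterns) {
  const clauses = [
    ["match", ["pattern", ["cartesian", ...patterns.map((p) => p.toAst())]]],
  ];
  const params = {};
  const builder = {
    where(field, op, value) {
      // The value is stored as a parameter and never becomes query text.
      const key = `p${Object.keys(params).length}`;
      params[key] = value;
      clauses.push(["where", [op, field, `$${key}`]]);
      return builder;
    },
    build: () => ({ clauses, params }),
  };
  return builder;
}

const { clauses, params } = match([
  node("p:Property").rel("near").node("d0:DartStop"),
  node("w:Workplace").rel("near").node("d1:DartStop"),
]).where("p.id", "eq", 42).build();

console.log(JSON.stringify(clauses));
console.log(params);
```

A misplaced comma in a chain like this is a JavaScript error that an editor or linter can flag before the query ever runs, which is exactly the kind of checking the string version lacks.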

And now use this library to build any future queries. Why take this approach?

In my opinion we should never have to write SQL, Cypher, or any other string-DSL alongside our normal programming language.

This point can be generalised into an argument for homoiconic languages like Lisp, where code is always (easily) machine-readable and machine-writeable, but I won’t go into that here.

Takeaway Points: