BVBbc*BVBdd->CBVBda->DBVBc_->E
Defining a Language
A programming language specifies rule-governed behaviour. You may know popular languages such as Python, C, or JavaScript. Here, I introduce a new programming language called AND.
AND is not a practical language for general-purpose computing. It is neither easy to read, nor easy to write, and limited in what it can express.
The language has two goals. First, its structure should resemble gene regulation—both how it works, and how it changes over time. Second, embedding it in simple models of single- and multi-cellular behaviour should let us explore what such a language can achieve.
Primarily, AND serves as a thinking tool—a means to reason about the role that DNA and its evolution can play in the bottom-up control of multicellular development.
The Structure of a Program
Here is an example AND program.
You might think this string of characters does not count as a “real” programming language. But many esoteric programming languages sport a syntax more bizarre than AND. Here is a “Hello world” program written in Brainfuck, a purposefully minimalist (and difficult to use) language.
++++++++[>++++[>++>+++>+++>+<<<
<-]>+>+>->>+[<]<-]>>.>---.+++++
++..+++.>>.<-.<.+++.------.----
----.>>+.>++.
This is the simplest, unadorned format. Adding spaces and syntax colouring clarifies the structure.
BVBbc*BVBdd->C BVBda->D BVBc_->E
Now we see three distinct expressions. Each expression encodes an update rule that modifies the program state. Think of the program state as working memory. In AND, this memory consists of a set of binary registers (containing 0 or 1) named from A to Z. Each update rule reads from, and writes to, these registers.
The Syntax of Expressions
Each expression consists of an activation condition and a target register separated by an arrow: ->. The activation condition has one or more binding terms, separated by a composition operator. A binding term starts with a three-letter binding operator, followed by two source registers.
| Name | Example(s) |
|---|---|
| Expression | BNBab*XZXcd->E |
| Activation Condition | BVBab*XZX_d |
| Binding Term | BVB_b |
| Binding Operator | BNB, UNU, XZX |
| Composition Operator | *, + |
| Source Register (read) | a, b, _ |
| Target Register (write) | A, B |
Upper and lower-case letters address the same registers. Lower-case letters read from a register; upper-case letters write to one. A special read-only null register, _, always contains the value 0.
So how do we evaluate an expression? Both the composition and binding operators encode Boolean operations. The composition operator is either * for logical AND, or + for logical OR. The binding operator is one of sixteen possible binary logical operations on two inputs (shown in Table 2).
| Op | Common name | Compact DNF |
|---|---|---|
| XZX | FALSE | ∅ |
| BNB | AND | αβ |
| BNU | α AND NOT β | α¬β |
| BNX | α | α |
| UNB | β AND NOT α | ¬αβ |
| XNB | β | β |
| QZQ | XOR | (α¬β) + (¬αβ) |
| BVB | OR | α + β |
| UNU | NOR | ¬α¬β |
| QNQ | EQ | (αβ) + (¬α¬β) |
| XNU | NOT β | ¬β |
| BVU | α OR NOT β | α + ¬β |
| UNX | NOT α | ¬α |
| UVB | NOT α OR β | ¬α + β |
| BZB | NAND | ¬α + ¬β |
| XNX | TRUE | ∞ |
BNBαβ. Both the command name, and the compact DNF form is shown (see below).
Translating Expressions
To see how these operators work together, consider the first expression in the program above: BVBbc*BVBdd->C. We can translate this into something more familiar by replacing the syntax with logical operators.
\[ \begin{array}[cccc] \enspace\texttt{BVBbc} & * & \texttt{BVBdd} & \texttt{->B} \\ \downarrow & \downarrow & \downarrow & \downarrow \\ (\mathtt{b} \lor \mathtt{c}) & \land & (\mathtt{d} \lor \mathtt{d}) & \Rightarrow \mathtt{C} \\ \end{array} \]
The translated expressions make it easier to see what each expression does. But they grow long and unwieldy in larger AND programs. Two further steps shorten them. First, we convert to Disjunctive Normal Form (DNF), gathering common factors into a series of ORed terms. Then we write the result in a compact syntax: juxtaposition for AND, + for OR, and no explicit logical operators. I call this compact DNF.
In the case above:
\[ (\mathtt{b} \lor \mathtt{c}) \land \mathtt{d} \Rightarrow \mathtt{B} \] becomes: \[ (\mathtt{b}\mathtt{d}) + (\mathtt{c}\mathtt{d}) \Rightarrow \mathtt{C} \]
Translating the other two expressions from the program, results in:
\[ \begin{align} (\mathtt{b} \mathtt{d}) + (\mathtt{c}\mathtt{d}) &\Rightarrow \mathtt{C}\\ \mathtt{a} + \mathtt{d} &\Rightarrow \mathtt{D}\\ \mathtt{c} &\Rightarrow \mathtt{E} \end{align} \]
Any AND program can be translated into a set of logical expressions like these. For the purpose of updating program state, the translated expressions behave identically to the original AND program. We can see how this works by running it.