By dkl9, written 2025-218, revised 2025-220 (1 revisions)
My maths game Functle uses postfix notation. Write each formula with operators after their inputs: "1 2 +" instead of 1 + 2. The discriminant b² - 4 a c is now "b 2 ^ 4 a c * * -", and Einstein's famous formula becomes "E m c 2 ^ * =".
This contrasts with infix, on which most maths notation is based. There, operators go between their operands, as in "b ^ 2 - 4 * a * c" or "E = m * c ^ 2".
My acquaintance E.D.S. dislikes postfix enough to pester me about its use in Functle. I owe him a detailed answer. I didn't have a good answer myself until recently.
But first, I must give due respect to the third option some readers will notice.
The natural third way is prefix. Put each operator before what it acts upon. This makes things like "- ^ b 2 * * 4 a c" and "= E * m ^ c 2".
Prefix is harder than postfix, so I prefer to ignore it. Prefix is just postfix backwards, so I mostly can ignore it.
Why "mostly"?
Reverse the postfix "b 2 ^ 4 a c * * -". You get "- * * c a 4 ^ 2 b". The "* * c a 4" part matches the "* * 4 a c" in the proper prefix form. Issues come only from the "^" and "-", sith they don't commute.
Prefix is just reversed postfix, with each operator applied in reverse order.
The obvious reason to prefer postfix is that it's easier for computers to parse and use. This is why postfix was used on some old calculators, where it was called "reverse Polish notation". An entire niche programming language, Forth, was built around postfix.
The less obvious reason, which I knew only vaguely as I made Functle, is that postfix is more concise than infix. This is especially important for Functle, in which expressions are only five symbols long.
You can parse a string of postfix notation in a single pass.
You can just as well evaluate a postfix formula in one pass, if you replace "sub-expression" with "result", "build" with "evaluate", and "parsed expressions" with "final values". That's what Functle does, and the simplicity of it makes the whole parse-evaluate step take just 25 lines of code. Explore how that works below.
Formula:
Stack:
To parse or evaluate infix in one pass is much trickier.
How much could you express with five tokens of infix? Say each token is an operand ("t", like 3), a unary operator ("u", like ln), or a binary operator ("b", like +). Say unary operators go before their operands. How many syntactically-distinct expressions can you make?
| Form | Example |
|---|---|
| uuuut | ln sin exp sqrt x |
| uutbt | exp sin x + 1 |
| utbut | exp x * sin x |
| tbuut | 2 * sqrt cos x |
| tbtbt | 2 * x + 3 |
And what about in postfix?
| Form | Example |
|---|---|
| tuuuu | x sqrt exp sin ln |
| ttbuu | x 2 * cos exp |
| tutbu | x sqrt x + ln |
| ttubu | x x sqrt - ln |
| ttuub | 1 x sin sin + |
| tutub | x exp x sqrt / |
| tuutb | x cos exp 2 / |
| tttbb | 1 x 1 + / |
| ttbtb | 1 x / 1 + |
At just five tokens, there are four postfix arrangements inexpressible in infix. The options new to postfix are marked with bold above.
"But what about parentheses?" you may object. Infix does make good use of parentheses, but they are tokens like any other. With five tokens, all you can do with parentheses is "(tbt)" or "u(ut)", which are useless.
If five tokens are too small a sample, take a more general argument. Postfix and infix generally take the same length to express a formula. But infix sometimes needs parentheses to work around the order of operations. Postfix doesn't use parentheses. Thus postfix strings are at most as long as equivalent infix, and so express more at equal length.
In general, the number of distinct infix expressions on n tokens is a Fibonacci number, while distinct postfix expressions follow the Motzkin numbers. I leave the proofs as exercises for the reader.
By now, postfix form should seem a compelling alternative. But even I, who likes to break convention, still don't use postfix for much, even privately. Yet.
Why is infix so compelling?
In postfix, the size of any sub-expression is defined from within by its structure. In "E m c 2 ^ * =", you can know, just by looking ahead, that "m" starts a sub-expression that ends at "*". Thus postfix gets to cram all operands of an operator before the operator. But to know which token goes in which sub-expression, you must read every token most of the way from the operator to the start of its first operand.
Infix makes it easier to tell operands apart: the operator sits between them. To see that "E" is one side of the equation and m-stuff is the other is much more direct from "E = m * c ^ 2" than from "E m c 2 ^ * =".
Computers must read the whole expression to use it, whatever the syntax. Humans struggle to read token-by-token, but can grasp infix better by jumping around. These shortcuts make infix the natural preference.
Are there other ways beyond infix, postfix, and prefix? Let's see what they have in common, and try to change from that.
A formula in any of these three forms naturally parses to a syntax tree. Each operator becomes a branch node, and each basic operand becomes a leaf. E.g. prefix says "- ^ b 2 * * 4 a c", infix says "b ^ 2 - 4 * a * c", and postfix says "b 2 ^ 4 a c * * -", but they all describe the same tree:
It turns out that any shape to show a string's syntax in terms of tokens must be a tree:
You can parse something that's connected more elaborately than a line, violating step 1. But that wouldn't be a string in the usual sense, so that's not just syntax, but graph theory.
You can use some symbols to decide where to add extra edges, violating step 3. But if you keep those symbols in the shape, the new edges are redundant, and if you take out the symbols, that's not just syntax, but also semantics.
You can arrange the tokens into a disconnected graph with cycles, violating step 4. But that would connote the tokens in different connected components are unrelated, so that's not just syntax, but classification.
When you accept syntax as trees, prefix, infix, and postfix are the only fully natural ways to arrange the string. You could make a different way, but it would be domain-specific and somewhat contrived.