Notes on ambiguity and its effects in spoken and programming languages.
This post touches on:
- syntactic ambiguity (and disambiguation) in spoken language (SL)
- syntactic ambiguity (and disambiguation) in programming languages (PL)
- deliberate ambiguity and its effects
PS: i’m not a linguist and all my findings are empirical so feel free to point out my mistakes…
Ambiguity
Let’s define ‘ambiguity’:
the quality of being open to more than one interpretation.
Ambiguity: in spoken languages
There are many kinds of ambiguity, but this post focuses on structural ambiguity (aka syntactic ambiguity).
Here’s an example:
he gave her cat food.
The above sentence could be interpreted in at least 2 ways:
- he gave her (cat food)
- he gave (her cat) food
(parentheses are used to disambiguate the structure, similarly to arithmetic expressions)
The English language provides syntactic means to disambiguate, so the sentence may become:
- he gave her cat-food
- he gave, her cat, food
or be restructured:
- he gave food to her cat
- he fed her cat
More on spoken language ambiguity
We’ll never have a completely unambiguous language because attempts to fix ambiguity result in more ambiguity:
- Why you need to be using Oxford commas
- How Oxford comma is creating ambiguity
- Court Rules the Oxford Comma Necessary
And ambiguity is likely here to stay: ambiguity is a good thing.
Ambiguity in PL syntax
To warm up, what’s the value of the expression:
1 or 2 and 3
It’ll likely take a second to disambiguate the expression before evaluating it:
- is it `(1 or 2) and 3`, or `1 or (2 and 3)`?
- and i’ve not specified the language…
Syntax ambiguity worsens readability and understanding since it requires more cognitive effort to decipher the intent.
Disambiguation in PL syntax
Disambiguation is done with special rules encoded in the grammar. Here’s a snippet from Ruby’s grammar (declarations listed later have higher precedence):
%left keyword_or keyword_and
...
%right keyword_not
...
%left tOROP
%left tANDOP
There are 2 main rules:
- Precedence: which operators bind tighter, i.e. which operations are grouped before others
- Associativity: how operators with the same precedence group their arguments, from the left or from the right (not to be confused with associativity in math)
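Both rules can be observed with plain arithmetic. Below is a minimal Ruby sketch (the variable names are made up for illustration); the comments show how each expression is grouped:

```ruby
# Precedence: `*` binds tighter than `+`, so it is grouped first.
by_precedence = 2 + 3 * 4      # grouped as 2 + (3 * 4)

# Left associativity: operators of equal precedence group from the left.
left_assoc = 10 - 4 - 3        # grouped as (10 - 4) - 3

# Right associativity: `**` groups from the right.
right_assoc = 2 ** 3 ** 2      # grouped as 2 ** (3 ** 2)

puts by_precedence  # 14
puts left_assoc     # 3
puts right_assoc    # 512
```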
So given the rules:
- `or` is `keyword_or`
- `and` is `keyword_and`
- they have the same precedence and are `%left` associative

the expression `1 or 2 and 3` is disambiguated as `(1 or 2) and 3`, which yields `3`.
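The grouping can be checked directly in Ruby. This is a small sketch with made-up variable names; note that the right-hand sides are wrapped in parentheses because `=` binds tighter than `or`/`and`, so `x = 1 or 2 and 3` would assign `1` to `x`:

```ruby
# The expression as written, with no explicit grouping.
kw_implicit = (1 or 2 and 3)

# The two candidate readings, spelled out with parentheses.
kw_left_first  = ((1 or 2) and 3)
kw_right_first = (1 or (2 and 3))

puts kw_implicit      # 3
puts kw_left_first    # 3, same as the implicit form
puts kw_right_first   # 1
```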
Another example (that looks very similar):
1 || 2 && 3
Given the rules:
- `||` is `tOROP`
- `&&` is `tANDOP`
- `tANDOP` has higher precedence than `tOROP`
- they’re both `%left` associative

the expression is disambiguated as `1 || (2 && 3)`, which yields `1`.
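The same kind of check works for the symbolic operators. Again a minimal Ruby sketch with illustrative variable names, with the keyword form included for contrast:

```ruby
# The expression as written, with no explicit grouping.
sym_implicit = 1 || 2 && 3

# The two candidate readings, spelled out with parentheses.
sym_right_first = 1 || (2 && 3)
sym_left_first  = (1 || 2) && 3

# The look-alike keyword form groups differently.
keyword_form = (1 or 2 and 3)

puts sym_implicit      # 1
puts sym_right_first   # 1, same as the implicit form
puts sym_left_first    # 3
puts keyword_form      # 3
```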
The main issue with disambiguation rules is that they’re implicit and not a part of the syntax being read, which makes the syntax easy to misinterpret.
Deliberate ambiguity in PL
Yes, PL designers choose to have an ambiguous grammar for the sake of “readability”, which, in this case, stands for “reduced syntax” and doesn’t mean non-ambiguous or exact.
Some examples of ambiguity:
- `+` is a unary and a binary operator
- `()` are used for scope delimitation, grouping, function invocation
- `{}` can be block scope or a value literal
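Ruby shows two of these cases well. In the sketch below, `twice` and `describe` are hypothetical helpers defined only for this illustration; the same characters are read differently depending on whitespace and parentheses:

```ruby
# Hypothetical helper: doubles its argument, which defaults to 10.
def twice(x = 10)
  x * 2
end

# Hypothetical helper: reports whether it received a block or an argument.
def describe(arg = :nothing, &block)
  block ? "block" : "argument: #{arg.inspect}"
end

# `+` as a binary operator vs. a unary sign on the first argument.
binary_plus = twice + 1   # read as twice() + 1
unary_plus  = twice +1    # read as twice(+1)

# `{}` as a block vs. a Hash literal.
as_block = describe {}    # braces are an empty block
as_hash  = describe({})   # braces are an empty Hash

puts binary_plus  # 21
puts unary_plus   # 2
puts as_block     # block
puts as_hash      # argument: {}
```

Running this with warnings enabled (`ruby -w`) makes Ruby flag the `twice +1` line as an ambiguous first argument.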
There are non-ambiguous syntaxes but, for some reason, they’re not in favour:
- lisps
- TODO: what else?
Cost of deliberate ambiguity in PL
One of the consequences of syntax ambiguity is the countless ways the same syntax may be structured.
Go completely removes this problem with `gofmt`.
Ambiguity is expensive: how many people-hours are spent on keeping things consistent? And all the yak-shaving? Tabs vs spaces? OMG…
#gofmt is not _just_ formatting it's #disambiguation.
For example `=+` easily confusable with `+=` but "formatting" fixes it #GoLang
Playground https://t.co/X4JxRiQQII
— gmarik (@gmarik) May 24, 2019
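The `=+` vs `+=` confusion is not specific to Go. The sketch below reproduces it in Ruby as a stand-in; it is not the code from the linked playground, and the variable names are made up:

```ruby
counter = 5
counter += 1    # compound assignment: counter = counter + 1

typo = 5
typo =+ 1       # plain assignment of a signed literal: typo = (+1)

puts counter    # 6
puts typo       # 1
```

Written as `typo = +1`, the second assignment would be hard to mistake for an increment, which is the point the tweet makes about formatting.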
Ambiguity vs Determinism elsewhere
- Grammars: Yacc is dead vs Yacc is alive
- Concurrency: Happens before
- TODO
Summary
So my personal take on language ambiguity:
- notice ambiguity, don’t assume.
- disambiguate by asking questions or being exact with syntax.
- exactness improves readability and understanding.
- avoid ambiguity in domains where its effects are not desired (computing).
- ambiguity is a ‘chaos-generator’ that exercises our anti-fragility.