In Trusting Trust paper, Ken Thompson touched several ideas that made me think.
STAGE I
…
to write a source program that, when compiled and executed, will produce as output an exact copy of its source. If you have never done this, I urge you to try it on your own. The discovery of how to do it is a revelation that far surpasses any benefit obtained by being told how to do it.
This blog post explores one of the ideas from the paper, specifically, the the idea of self-reproducing program, aka Quine:
A quine is a computer program which takes no input and produces a copy of its own source code as its only output. The standard terms for these programs in the computability theory and computer science literature are “self-replicating programs”, “self-reproducing programs”, and “self-copying programs”.
So, let’s develop a self-replicating program aka quine.
The problem
The problem is evident: self-reference.
Given the simplest ruby program that prints nothing:
printf
to print its content the program has to change its content(yes self-referencing) add print it:
printf 'printf'
The paradox: as result of the change, the output no longer matches the content. This may continue indefinitely.
There’s must be an other way.
Divide and Conquer
Let’s split program into content part and output part:
p='printf'
printf p
Now we can adjust each part separately to match the other one, and iterate until both match.
Let’s compare content(of the program saved to quine.rb
) and its output with diff
:
$ diff quine.rb <(ruby quine.rb)
1,2c1
< p='printf'
< printf p
---
> printf
\ No newline at end of file
It’s far off:
- the content is 2 lines, but
- the output is just a
printf
By taking advantage of the content variable p
and printing it twice:
p='printf'
printf "%s\n%s",p,p
we can get closer to our goal:
$ diff quine.rb <(ruby quine.rb)
1,2c1,2
< p='printf'
< printf "%s\n%s",p,p
---
> printf
> printf
\ No newline at end of file
- to get rid of the
\ No newline at end of file
let’s add the"\n"
- to display missing
p=
- let’s change the format to include it into the output
p='printf'
printf "p=%s\n%s\n",p,p
yields the diff:
$ diff quine.rb <(ruby quine.rb)
1,2c1,2
< p='printf'
< printf "p=%s\n%s\n",p,p
---
> p=printf
> printf
Getting closer. Now we need to change the output p
to match the content:
p='printf "p=%s\n%s\n",p,p'
printf "p=%s\n%s\n",p,p
yields:
diff quine.rb <(ruby quine.rb)
1c1
< p='printf "p=%s\n%s\n",p,p'
---
> p=printf "p=%s\n%s\n",p,p
Almost there, what we’re missing are the quotes '
.
Naively, we may proceed to adding quotes to the printf
output and then modifying the content p
:
p='printf "p='%s'\n%s\n",p,p'
printf "p='%s'\n%s\n",p,p
which isn’t going to work since the quotation is now broken.
This is another self-reference problem, since quotes cannot be included into quoted string as the quotes need be escaped, which in turn leads to changing the output.
Instead we need to use something else than literal quotes, either unicode literals:
p='printf "p=\u0027%s\u0027\n%s\n",p,p'
printf "p=\u0027%s\u0027\n%s\n",p,p
or like this:
p='printf "p=%s\n%s\n",0x27.chr+p+0x27.chr,p'
printf "p=%s\n%s\n",0x27.chr+p+0x27.chr,p
yields:
$ diff -s quine.rb <(ruby quine.rb)
Files quine.rb and /dev/fd/63 are identical
Success: the content of the quine.rb
and its output is identical.
Summary
Initially, upon reading about the challenge, I thought there’s some magical way to self-reference the code, which is normally isn’t the case and programmer has to almost duplicate the program in order to have self-reference. The trick is to break the self-reference cycle and work around it with variables and alternative quote representations(which can be viewed as as lowering abstraction level from syntax to underlying byte representation)
PS: Go and embed
draft
It’ll be much simpler to write self-replicating programs with Go once the embed draft is accepted.
Here’s an example:
// quine.go
package main
import "embed"
//go:embed quine.go
var src embed.Files
func main() {
data, _ = src.ReadFile("quine.go")
print(string(data))
}
Once compiled, the source of the quine.go
is embedded into src
which the program may then self-reference by name.
Watch Russ Cox introduces embed draft for more details.
Go quine without embed
-ing
for the record.
package main
import "fmt"
func main() {
const q = string(rune(96))
fmt.Printf("%s%s", self, q+self+q+"\n")
}
var self = `package main
import "fmt"
func main() {
const q = string(rune(96))
fmt.Printf("%s%s", self, q+self+q+"\n")
}
var self = `