Three Futamura Projections and more

Hello, habrachelovek. Today I will tell you about some fundamental things in computer science: partial computing, three Futamura projections and supercompilation.


1. Go to the code


-- функция, которая возводит x в степень y (неотрицательную)
power x y =
case y of
0 → 1
1 → x
_ → x * (power x (y - 1))



Suppose we use this function as follows:
This tiny example clearly shows the most important pattern in programming: abstraction.

As in this example, most of the time we use abstractions in special, special cases, while the unused part inevitably causes overhead in the calculation.

Think of your favorite web framework, for example. But what part of its code does a really useful job in your application?

This question is literally crazy for many inexperienced programmers, and they begin to engage in premature optimization, denial of abstractions, and other silly things.

But the use of abstractions does not have to entail overhead. Code using heavyweight omnipotent frameworks does not have to run slower than manually written ad hoc code.

Partial computing is a powerful technique that completely eliminates the overhead from abstractions. The vast majority of optimizations that significantly increase the speed of a program are special cases of partial computation of a program. A simple example:
Almost every modern compiler will be able to simplify (and simplify) this expression by rewriting it at compilation so that you do not need to calculate it again and again when you run the code:

var someMagicNumber = 4294967296


This is the simplest example of partial computing. This term is called so because during this optimization some part of the program is executed and the result is saved, while the other part remains unchanged.

Let's try to partially-calculate our first example with a square area. Imagine yourself in the place of the compiler. The compiler knows that the power function is always calculated in this place with the parameter y = 2. Therefore, it can be partially-calculated, in other words, specialized in this parameter.

Specialization, step 1. We substitute the constant 2 instead of the free parameter y.

-- функция, которая возводит x в степень y (неотрицательную)
power2 x =
case 2 of
0 → 1
1 → x
_ → x * (power x (2 - 1))


Step 2. Calculate the case-of, discarding the absurd branches 0 == 2 and 1 == 2:
Step 3. Calculate the expression (2 - 1):
Step 4. Substituting y = 1 into the power function, we get:
Surely you have already drawn an analogy with the inlining technique - function substitution familiar from C / C ++. These are also partial calculations.


2. Projections of Futamura



Professor Yoshihiko Futamura once fantasized about the interpretation of programs in the context of partial computing. I must say, it turned out quite amusing, if not to say that there is complete roof cover:
Let us have some magic function that does partial calculations (specializing programs):
For example, if you put the function function from the example above and the data “y = 2 in specialize »We get the power2 function. Pretty simple.

And there is an interpreter of a certain language, for example, a PHP interpreter written in assembler (ho-ho).

php_eval (php_code) : data


At the input there is a line with the php code, at the output is the result of the calculation. Nothing unusual either.

Attention, a question. Think about what will happen if we do this:
We mix the php interpreter in assembly language and the string with the php code. The result is an asm program that does the same thing as a php program. And this means that we compiled our php code into asm!

So, the first projection of Futamura : compilation is the specialization of the interpreter with the code of the interpreted program.

compiled_php_code = specialize (php_eval, php_code)


Funny, isn't it?

Further more. What will happen if we do:
We mix the assembler specialist and interpreter of the PCP. It turns out - a program to which any php code can be input and it will turn it into assembler code.

It turns out that having only a PHP interpreter and a magic specialist, we were able to give birth to a PHP compiler!

This is Futamura’s second projection : the compiler is the specialization of a specialist by interpreter code.

php_compiler (php_code) = (specialize (specialize, php_eval)) (php_code)


The head already starts to hurt a little, but what a result - the compiler was made from the interpreter! And that's not all ... Let's do it:
OMFG, what is it? This is a compiler generator. We give any interpreter to the input, we get the compiler at the output.

make_compiler (interpreter) = (specialize (specialize, specialize)) (interpreter)


This is the third projection of Futamura .


3. Supercompilation, meta-calculations



Partial calculations can reduce the code by substituting previously known data into the program. But there is an even more advanced technique that can optimize the code much more, and partial computation is a subset of it - supercompilation .

A compiler that mathematically simulates the execution of a program and then uses this model to produce a more efficient program is called a supervising compiler .

It is interesting that the authorship of this term (and related research) belongs to our compatriot - Valentin Turchin . He also developed the Refal language , which demonstrates the concept of supercompilation (the language is very old and based on rewriting terms).

Supercompilation is also called the "abstract interpretation" of programs. In Turchin, this is indicated by the term “driving” - or “driving” in English-language publications.

When running, the compiler builds a tree of program processes — a graph of all possible program states with edge transitions between them.

The compiler, as it were, is trying to comprehend the process of the program. I will try to illustrate it with a simple example:
Do not find anything strange in this code? For any value of X, the algorithm will never reach either xyzzy nor plugh . Mentally scroll through the algorithm and you will see this. Think about how you came to this conclusion - after all, the supercompiler works in exactly the same way.

We build a process tree:
Partially-compute case-of branches:
Therefore:
There were attempts to implement this technique for Java:
www.supercompilers.com/white_paper.shtml
And, recently, for Haskell (Supero):
www-users.cs.york.ac .uk / ~ ndm / supero /

Probably, some readers have already imagined a certain compiler, which, using partial computation and supercompilation, completely optimizes any program. But to solve this problem is impossible - it belongs to the class of unsolvable problems .

In reality, the process tree of a program becomes infinite or too large for intelligent processing. This happens when trying to analyze even small programs, let alone large commercial applications with tens of millions of lines of code.

There are various techniques to work around this problem: generalization - generalization of infinite trees, convolution (folding, code sharing) - a generalization of the same branches of the calculation.

There are also many problem areas in the implementation of these techniques - for example, it is not always easy to determine when to generalize and when not to. As the saying goes, the devil is in the details.


4. Programming by inversion



Supercompilation of programs gives interesting side effects - it can be used to solve logical programming problems (remember Prolog ?), Prove theorems and invert calculations .

Having a lot of inputs and outputs of a function and an abstract calculation graph (a process tree), in some special cases it is possible to invert the calculation.

Suppose we have the following task: to find in the string “abcd” all substrings 2 characters long.

У нас есть функция (strstr a b), возвращающая true если b содержит a. Then we could write down the solution by applying the inversion as follows:
Immediately there are associations with the pattern matching technique in functional programming languages. Actually, this is the case - inversion of algorithms is the key to abstract pattern matching.


Instead of a conclusion



Here is such an article, haberman. I hope she helped you look at a different angle on the interpretation, optimization and compilation of programs.

P.S. Learn You a Haskell for Great Good!