BF is Turing-complete

Alan Turing came up with the idea for the so called Turing Machine, a very simplistic computing model, which yet is powerfull enough to calculate all possible function which can be calculated. A Turing Machine consists of an endless tape of cells, which each can contain any of a given set of symbols, a pointer pointing to one of the cells, and a certain finite state control. Depending on the function to be calculated the finite state control determines how the tape is manipulated. It does this by reading the symbol under the pointer, and decides what to do: write a new symbol on the current cell, move the tape to the left, or to the right, and what will be the next state.

Each turing machine can be specified by the five elements:

A finite set of symbols A, also called the alphabet.
An initial symbol a_init, which each cell contains on initialization.
A finite set of states S.
An initial state s_init.
A final state s_final.
A function f of state and symbols onto a tupple consisting of a new state, a new symbol, and a movement. The movement can be one of right and left.

Many interesting properties of Turing Machines have been proven so far. For example, that each Turing Machine can be transformed into a Turing Machine with a tape that is cut-off on one side, and endless on the other side. Or that every Turing machine can be transformed into a Turing machine with set of symbols contains just two symbols.

A language is said to be 'Turing-complete', if for each functions that can be calculated with a Turing Machine, it can be shown that there is a program in this language that performs the same function. There are basically three approaches to proof that a language is Turing-complete. These are:

Show there is some mapping from each possible Turing machine to a program in the language.
Show that there is a program in the language that emulates a Universal Turing Machine.
Show that the language is equivalent with (or a superset of) a language that is known to be Turing-complete.

On the rest of the page, proofs of each of the above approaches will be given.

Translating Turing Machines to BF programs

We can proof that BF is Turing-complete, if we can show for every possible Turing machine, there is an equivalent BF program. Of course, we only need to show that a Turing Machine with a tape limited on one side can be simulated. What we need is an algorithm, which given a Turing machine, specified by the six elements given above, generates a BF program, which will simulate the Turing machine.

To achieve this, the set of symbols A is mapped onto the numbers 0 to n-1, where n equals the number of symbols, such that a_init maps onto zero. Likewise, the set of states S is mapped onto the numbers 0 to m-1, where m equals the number of states, such that s_init maps onto zero, and s_final maps onto 1.

To simulate a Turing machine, we need an array with unlimited length to represent the tape, a index into that array representing the position of the head, and the current state. To program this in BF we will need to assign these to some memory locations. The position of the head will be stored in memory location head, and the current state in the memory location state. Of course, some additional temporary memory locations will be needed. Using the terminology introduced on the page An introduction to programming in BF, we need a program of the following form:

IsOne(state,t1) Not(t1,t2,t3) while(t2)

getarray(base,1,0,head,cursymbol)

(some code which given the state and the cursymbol
assigns a value to newstate, newsymbol, and newhead)

setarray(base,1,0,newhead,newsymbol) zero(cursymbol) zero(newsymbol) zero(head) move(newhead,head,t1) zero(state) move(newstate,state,t1)

IsOne(state,t1) Not(t1,t2,t3)

wend(t2)

Now we only need to fill in the middle part of the loop. For each tupple in f we need to generate some code. This means that for each symbol i in the range 0 to n-1, and each state j in the range 0 to m-1, we have to insert a line:

   map(state,head,cursymbol,
       i,j,k,l,m,
       newstate,newhead,newsymbol)

Where, given the tupple f(s_i,a_j), k is the number representing the new state, l is the number representing the new symbol, and m is equal to 1 if the movement is right and equal to 0 if the movement is left. The procedure map is defined as follows:

  map(state,head,cursymbol,
      astate,asymbol,anewstate,anewsymbol,movement,
      newstate,newhead,newsymbol)
  = 
    copy(state,t1,t2)
    const(t2,astate)
    Equal(t1,t2,t3,t4,t5)
    if(t3)
      copy(cursymbol,t1,t2)
      const(t2,asymbol)
      Equal(t1,t2,t3,t4,t5)
      if(t3)
        const(newstate,anewstate)
        const(newsymbol,anewsymbol)
        copy(head,newhead,t1)
        ifelse(movement,t1)
          inc(newhead)
        else(movement,t1)
          dec(newhead)
        endelse(t1)
      endif(t3)
    endif(t3)

In this code the expression const(v,c) stands for the piece of code that fills the memory location v with the value c. Now, we only need to assign real values to the variable representing the memory locations to arrive at the final BF program executing the given Turing machine. Note that memory location assigned to base should be larger than all other memory locations used.

Universal Turing Machine

The text of this section is based on
this fully documented BF program
written by Daniel B. Cristofani.

A Universal Turing Machine (UTM) is a Turing machine that can simulate some Turing-complete computational model. By giving a BF program which simulates a particular UTM, we proof that BF is Turing-complete. The UTM that we implement here is taken from from Yurii Rogozhin's article Small universal Turing machines, in Theoretical Computer Science, November 1996 (volume 168 pgs 215-240). This UTM simulate a Turing-complete class of tag-systems. A tag-system transforms strings over an alphabet A = {a[1],a[2],...,a[n], a[n+1]} as follows: a positive integer m is chosen, and so is a function P that maps each a[i] for 1<=i<=n to a string P(a[i]) over the alphabet A. Now:

if the string being transformed has fewer than m elements, the whole process stops now.
m elements are removed from the beginning of the string
call the first element removed a[k]; if k=n+1 the whole process stops now.
P(a[k]) is appended to the string.
steps 1-5 are repeated.

The particular class of tag-systems this Turing machine simulates is the class where m = 2, the initial string has length at least 2, and all P(a[i]) where 1<=i<=n are of the form a[n]a[n]B[i] where B[i] is some string over the alphabet A (B[i] is the empty string if and only if i=n).

The input for this Turing machine is mildly complex, and there is no error checking.

The representation of a symbol a[i] from the alphabet A is a string of 1s which is one element longer than twice the combined length of all P(a[j]) where 1<=j<i.
a value like P(a[i]) = a[n]a[n]a[w]a[x]...a[y]a[z] is represented as follows:
The function P is represented by listing all its outputs in the order P(a[n]), P(a[n-1]),...,P(a[2]),P(a[1]). The representation of P(a[n+1])=STOP is done for you.
The initial string a[q]a[r]...a[s]a[t] to be transformed by the tag-system is represented as
The input to this program is a function P as described above, then a single b, then the initial string to be transformed. Run all the 1s, bs, and cs together in one line with nothing between, followed by a return.
The output format, if the program terminates, is the same as the input format for the initial string, and represents the final state of the transformed string, with a final a[n+1] appended due to a peculiarity of the Turing machine's algorithm.

An example input string

We take as an example the tag-system over the alphabet A = {a[1], a[2], a[3], a[4]}, where m = 2 and:
P(a[1]) = a[3]a[3]a[2]a[1]a[4]
P(a[2]) = a[3]a[3]a[1]
P(a[3]) = a[3]a[3]
P(a[4]) = STOP

It meets the criteria above; and when applied to the initial string a[2]a[1]a[1] it gives:

a[2]a[1]a[1]
        a[1]a[3]a[3]a[1]
                a[3]a[1]a[3]a[3]a[2]a[1]a[4]
                        a[3]a[3]a[2]a[1]a[4]a[3]a[3]
                                a[2]a[1]a[4]a[3]a[3]a[3]a[3]
                                        a[4]a[3]a[3]a[3]a[3]a[3]a[3]a[1]
                                                a[3]a[3]a[3]a[3]a[3]a[1]

and then it's done.

Now, the encoding:

a[1] is 1
a[2] is 11111111111
a[3] is 11111111111111111
a[4] is 111111111111111111111

P(a[1]) is b1 b111111111111111111111b b1b b11111111111b b11111111111111111b b1111111111111111
P(a[2]) is b1 b1b b11111111111111111b b111111
P(a[3]) is b1 b11111111111111111b b

the initial string is 11111111111c1c1

and so the whole input is

b1 b11111111111111111b b
b1 b1b b11111111111111111b b111111
b1 b111111111111111111111b b1b b11111111111b b11111111111111111b b1111111111111111
b
11111111111c1c1

which when run together for input to the Turing machine becomes

b1b11111111111111111bbb1b1bb11111111111111111bb111111b1b111111111111111111111bb1bb11111111111bb11111111111111111bb1111111111111111b11111111111c1c1

The Universal Turing Machine

The Universal Turing Machine (UTM) implemented by the BF program is a four state, six symbol (namely: '1','b','<','>','0', and 'c') Turing machine. For those interested, the state table of the machine is given in the following table. Each cell contains a 5-tupple representing the current state, the symbol read, the symbol written, the action (L stands for move left, R stands for move right and H stands for halt), and the new state.

11<L1 210R2 311R3 410R4

1b>R1 2b>L3 3b<R4 4bcL2

1>bL1 2><R2 3>bR3 4><R4

1<0R1 2<>L2 3< H 4<

10<L1 201L2 30cR1 40cL2

1c0R4 2cbR2 3c1R1 4cbR4

The initial state is 1, tape cells are set as per the input but with the termination code P(a[n+1])=STOP represented as a <b at the left end and all other cells on both sides (actually, only the right is needed) holding symbol 0, and the head is initially at the first 1 in the code for the initial string.

The minimal test case b1b1bbb1c1c11111 represents the tag-system where P(a[1]) = a[1]a[1] and P(a[2]) = STOP, applied to the string a[1]a[1]a[2]. This runs for 518 steps of the Turing machine, exercising all 23 Turing machine instructions, before halting with the output string a[1].

BF program

Below follows the BF program which implements the above Universal Turing Machine. (For a detailed explaination of this program, read the original source file by Daniel Cristofani.)

+++>++>>>+[>>,[>+++++<[[->]<<]<[>]>]>-[<<+++++>>-[<<---->>-[->]<]]
<[<-<[<]+<+[>]<<+>->>>]<]<[<]>[-[>++++++<-]>[<+>-]+<<<+++>+>
  [-
    [<<+>->-
      [<<[-]>>-
        [<<++>+>-
          [<<-->->>+++<-
            [<<+>+>>--<-
              [<<->->-
                [<<++++>+>>+<-
                  [>-<-
                    [<<->->-
                      [<<->>-
                        [<<+++>>>-<-
                          [<<---->>>++<-
                            [<<++>>>+<-
                              [>[-]<-
                                [<<->>>+++<-
                                  [<<->>>--<-
                                    [<<++++>+>>+<-
                                      [<<[-]>->>++<-
                                        [<<+++++>+>>--<-
                                          [<->>++<
                                            [<<->>-
]]]]]]]]]]]]]]]]]]]]]]<[->>[<<+>>-]<<<[>>>+<<<-]<[>>>+<<<-]]>>]
>[-[---[-<]]>]>[+++[<+++++>--]>]+<++[[>+++++<-]<]>>[-.>]

URM is BF-complete

URM stands for Universal Register Machine. The statement "URM is BF-complete", simply means that all possible URM programs have an equivalent BF program, if each memory cell may contain an arbitrary large number. Actually, the set of possible BF programs is much larger than the set of possible URM programs. You would hardly believe it, but URM is also Turing-complete, due to the fact that the registers can store infinite numbers, although the mapping from general Turing machines to URM is very, very complicated. Based on the ideas for a UTM in BF, Daniel Cristofani wrote a UTM for URM using only five registers.) These fact combined, proof that BF is Turing-complete.

The Universal Register Machine is a machine with a fixed number of characters, and it supports the following commands:

an : increment register n
sn : decrement register n
x; y : execute command x and then y
(x)n : execute command x while register n is nonzero
. : ("dot blank") halt the machine.

Examples URM programs are:

Add register 3 to register 2:
```
   (a2;s3)3.
```

Multiply register 2 with register 3:

   (a4;a5;s2)2; ((a2;s4)4; s3; (a1;a4;s5)5; (a5;s1)1)3.

There is almost a one-to-one mapping from URM to BF. The an expression maps to +, the sn expression maps to -, and the (x)n expression maps to [x], of course preceded with the right amouth of > and < to move to the indicated memory location. The translation of the above two URM programs to BF are:

Add register 3 to register 2:
```
   >>>[<+>-]<<<
```

Multiply register 2 with register 3:

   >>[>>+>+<<<-]>[>[<<+>>-]<->>[<<<<+>>>+>-]<<<<[>>>>+<<<<-]>>]<<<

It is possible to make context free grammars for both BF and URM, such that matching parse trees indicate equivalent programs. Such a grammar is given below:

URM	BF
`root = S₁.`	`root = S₁`
`S_n = an`	`S_n = +`
`S_n = sn`	`S_n = -`
`S_n = S_n ; S_n`	`S_n = S_n S_n`
`S_n = ( S_n )n`	`S_n = [ S_n ]`
`S_n = S_n+1`	`S_n = > S_n+1 <`
`S_n = S_n-1`	`S_n = < S_n-1 >`

Using this mapping, one can create another UTM in BF from the above mentioned UTM in URM.

Still, I feel that BF is far more elegant than URM, because it gives you more with less symbols. BF only needs the symbols "+", "-", "<", ">", "[", and "]", whereas URM needs "a", "s", ";", "(", ")", ".", and "0" to "9". Although, I have to admit, that BF programs will in general be longer than the equivalent URM program, and in most cases also more cryptic.

home and email

11<L1	210R2	311R3	410R4
1b>R1	2b>L3	3b<R4	4bcL2
1>bL1	2><R2	3>bR3	4><R4
1<0R1	2<>L2	3< H	4<
10<L1	201L2	30cR1	40cL2
1c0R4	2cbR2	3c1R1	4cbR4