The Art of Programming

Programming is sometimes called an art. Lately, I have been wondering whether it is an art, or a technique. For example, what do the following two programs in common?

void main()
{   int n500 = 0, n250 = 0, n100 = 0, n25 = 0,
        n10 = 0, n5 = 0, a = 500, m = 0;

    while (a >= 0)
    { while (a >= 0)
      { while (a >= 0)
        { while (a >= 0)
          { while (a >= 0)
            { m++;
              n10++; a -= 10;
            }
            a += 10 * n10; n10 = 0; n25++; a -= 25;
          }
          a += 25 * n25; n25 = 0; n100++; a -= 100;
        }
        a += 100 * n100; n100 = 0; n250++; a -= 250;
      }
      a += 250 * n250; n250 = 0; n500++; a -= 500;
    }
  printf("Number of combinations is %d\n", m);
}

void main()
{   int m[100][5];
    int i,j;
    int v[] = { 2, 5, 20, 50, 100 };

    for (i = 0; i < 100; i++)
        for (j = 0; j < 5; j++)
            m[i][j] = 0;

    for(i = 0; i <=500; i++)
        for (j = 0; j < 5; j++)
            m[i%100][j] =   (j == 0 ? 1 : m[i%100][j-1])
                          + m[(i + 100 - v[j])%100][j];

    printf("Number of combinations is %d\n", m[500%100][4]);
}

The answer is that these programs both print the value of c(500, {5, 10, 25, 100, 250, 500}), where the function c is defined (using some common math notations) as follows:

   c(a : nat, V : P nat)
   = count({ { (v_1, c_1), .. , (v_n, c_n) } : P (nat, nat)
           |     v_1 * c_1 + .. + v_n * c_n = a
             and {v_1, .. , v_n} = V} and |V| = n)

Or to say it in words: In how many ways can I pay 500 if I do have sufficient amouth of coins with the values 5, 10, 25, 100, 250, and 500.

This shows that two vastly different programs calculate exactly the same thing. The question, which of the two is the best, is not easy to answer. If you only need to calculate this question once, it does not matter which one you use. But what if there are restrictions to the number of coins you have, or if you have to do it for other amounth as well. Or what if you have a different set of coins.

Specification versus Implementation

Above, I gave two implementations of a specification. Writing a program is nothing else then implementing a specification. Usually, the specification is given in prose, and not in a mathematical formulae. The art of specifying is to give a description which exactly states the problem, without saying anything too much or too less.

While implementing a specification, one makes the following kind of choices related to:

Generic versus specific solutions.
Order of calculation.
Algorithms to be used.
Which intermediate results to store.
Representation of the data.

Note, that there is not a strict order in making these kind of choices. A decision related to data storage, can often influence a choice with respect to which algorithm is the best.

Implementating versus implementation langauges

An implementation langauge is a language in which a particular implementation of specification is written. It is a language to write down the result of the implementation process. The C language which I used to write my two programs is an example of a implementation langauge. Please note that the concepts `specification language' and `implementation language' are relative notions. A certain language can both be used as a specification language in one context, and as a implementation language in another.

An implementating languages, is a language which expresses how a certain specification is implemented. Such a language would describe how a given specification is transformed into a program. An implementation language operates on expression representing data and algorithms. Because the transformations performed on the specification can be rather complex, such a language should have very powerful features to describe these transformations.

First implementation of the problem

Lets look at the implementation transformations which were applied to result in the first program. Below we present a number of transformation steps, each using a specific technique.

Mapping sets to lists

A commonly used technique in implementation is mapping sets to lists. There are several reasons for doing this. One is that lists are closer to concepts used in imperative languages, but another, maybe more important reason, is that some algorithmes make use of an aselective select operation on sets. (Not having an `aselect select' operation can lead to very inefficient algorithms.) If we map the sets to lists for our problem, we get:

   c(a : nat, V : L nat)
   = count({ S : L nat
           |     count(S) = count(V)
             and (sum v * s for (v,s) in V.S) = a
           }

A problem with this specification is that the expression V.S, where `.' stands for the structural inproduct operation, is ill-defined if V and S are of different lenght. Lets replace the list with arrays:

   c(a : nat, V : array[n] nat)
   = count({ S : array[n] nat
           | (sum V[i] * S[i] for i in { 1 .. n }) = a
           }

Removing predicate set expressions

The first step in the analysis, is to remark, that there are certain conditions on the values in the array S, as a result that it consists of natural numbers. based on the rule that x + y >= x if y : nat, we can conclude that V[i] * S[i] <= a for each i. We can also conclude that (sum V[i] * S[i] for i in { 1 .. j }) <= a for each j in { 1 .. n }. This observation leads to a recursive function generating all sets. Such a recursive function can be specified with:

   all_sol(a : nat, V : array[n] nat,
           S : array[m] nat) : P array[l] nat
   = if n = 0
     then if a = 0
          then { S }
          else {}
          endif
     else unnest(collect all_sol(a - V[1] * s, V[2..n], S:(s))
                 for s in { 0 .. a mod V[1] })
     endif

It is far from trivial to prove that all_sol(a, V, ()) is indeed equal to:

    { S : array[n] nat
    | (sum V[i] * S[i] for i in { 1 .. n }) = a
    }

Folding of operations

If we realise that count(all_sol(a, V, ()) only counts the number of solutions, and does not use the generated solutions, we can try to the count operation inside the expression. Which results in:

   c(a : nat, V : array[n] nat, S : array[m] nat) : nat
   = if n = 0
     then if a = 0
          then count({ S })
          else count({})
          endif
     else count(unnest(collect all_sol(a - V[1] * s, V[2..n], S:(s))
                       for s in { 0 .. a mod V[1] })
     endif

The count(unnest(collect .. )) construct can be simplified under the condition that all the sets produced by the calls of all_sol are disjunct. It can be shown that this is indeed true. With some more simplifications, this leads to:

   c(a : nat, V : array[n] nat, S : array[m] nat) : nat
   = if n = 0
     then if a = 0 then 1 else 0 endif
     else sum c(a - V[1] * s, V[2..n], S:(s))
          for s in { 0 .. a mod V[1] }
     endif

Now we observe that the parameter S has become obsolete. Removing it leads to:

   c(a : nat, V : array[n] nat) : nat
   = if n = 0
     then if a = 0 then 1 else 0 endif
     else sum c(a - V[1] * s, V[2..n])
          for s in { 0 .. a mod V[1] }
     endif

Specialisation

The easest route to the first solution in C, is to apply specialisation now. Specialisation (also called partial evalutation) is a technique by which a given part of code is simplified by providing some (but not all) of the imput values. What we will calculate is what c(a, (500, 250, 100, 25, 10, 5) would look like. We call the resulting function c1:

   c1(a : nat) : nat
   = if 6 = 0
     then if a = 0 then 1 else 0 endif
     else sum c(a - 500 * s, (250, 100, 25, 10, 5))
          for s in { 0 .. a mod 500 }
     endif

We notice that the above can be simplified, and that we can apply another specialisation of the function c. If we apply this several times, we get:

   c1(a : nat) : nat
   = sum c2(a - 500 * s) for s in { 0 .. a mod 500 }
   c2(a : nat) : nat
   = sum c3(a - 250 * s) for s in { 0 .. a mod 250 }
   c3(a : nat) : nat
   = sum c4(a - 100 * s) for s in { 0 .. a mod 100 }
   c4(a : nat) : nat
   = sum c5(a - 25 * s) for s in { 0 .. a mod 25 }
   c5(a : nat) : nat
   = sum c5(a - 10 * s) for s in { 0 .. a mod 10 }
   c6(a : nat) : nat
   = sum c5(a - 5 * s) for s in { 0 .. a mod 5 }
   c7(a : nat) : nat
   = if a = 0 then 1 else 0 endif

Inline expansion

It would be interesting to do some inline expansion for the last two functions. Inline expansion is a technique by which the call of a function is replaced by its body. We are using a `let ... in ... ' construct for the parameter expressions. We also rename the variable a of c7 into a1, which is not needed, but improved readability. This results in:

   c6(a : nat) : nat
   = sum
        let a1 = (a - 5 * s)
        in if a1 = 0 then 1 else 0 endif
     for s in { 0 .. a mod 5 }

Applying some simplifications, this results in:

   c6(a : nat) : nat
   = sum
        if (a - 5 * s) = 0 then 1 else 0 endif
     for s in { 0 .. a mod 5 }

Some reasoning about this expression, leads us to the conclusion that (a - 5 * s) = 0 cannot be true for s < a mod 5. This leads to the simplification:

   c6(a : nat) : nat
   = if (a - 5 * s) = 0 then 1 else 0 endif
     where ( s = a mod 5)

Another substitution leads to:

   c6(a : nat) : nat
   = if (a - 5 * (a mod 5)) = 0 then 1 else 0 endif

Which can be simplified into:

   c6(a : nat) : nat
   = if (a mod 5) = 0 then 1 else 0 endif

More inline expansion

With some additional inline expansion, and clever name picking, we come very close to our first C program:

    c(a : nat) : nat
    = sum
        let a1 = a - n500 * 500 in
        (sum
           let a2 = a1 - n250 * 250 in
           (sum
              let a3 = a2 - n100 * 100 in
              (sum
                 let a4 = a3 - n25 * 25 in
                 (sum
                    let a5 = a4 - n10 * 10 in
                    (if (a5 mod 5) = 0 then 1 else 0 endif)
                  for n10 in { 0 .. a4 mod 10 })
               for n25 in { 0 .. a3 mod 25 })
            for n100 in { 0 .. a2 mod 100 })
         for n250 in { 0 .. a1 mod 250 })
      for n500 in { 0 .. a mod 500 }

From functional to imperative

Although the above results looks very simular to the first C program, there is a very essential difference. C is an imperative program, which means that...

home page