Notes on Elements of Programming by Alexander Stepanov & Paul McJones

Table of Contents

1 Preamble

A good technical book starts by defining the words it uses. This book begins with precise definitions of elementary notions in programming such as value and equality. By not relying on intuition, the book proceeds to rigorously define and explore the foundations of generic programming. One of the authors was a creator of the C++ Standard Template Library.

This book uses a small subset of C++, focusing on templates and generic programming. It is also more mathematical than most books about programming.

Since I majored in mathematics and do software development for a living, I've chosen this book as a focal point for actively improving my skills and the quality of the work that I do. The best way for me to get the most out the book is to read it carefully, memorize the definitions, prove the lemmas, do the exercises, and publish all that work for others to see and criticize. If you have any comments, you can email me or submit a pull request on this file: https://github.com/tlehman/elem-prog

I wrote this in org-mode in the Emacs text editor. The main benefit is being able to edit and eval code directly in the same file as these notes. I export the org text to an html file that uses the MathJax library to render math symbols using LaTeX, it's pretty awesome :)

Success for this note-taking and reading will be if you can get as much out of the book from reading this document as I did reading the book and writing this. My goal is distillation of the whole book.

2 Chapter 1: Foundations

Programming is all about manipulating objects (e.g. 42), those objects have types (e.g. int), and those types cluster together into similar concepts (e.g. numbers).

2.1 Entity

2.1.1 Abstract entity

An individual thing that is eternal and unchanging. For example, the number 42.

2.1.2 Concrete entity

An individual thing that comes into existence in space and time. For example,

int n = 42;

The particular concrete entity n will only exist as long as it's in scope in the program, then it will eventually be overwritten.

2.2 Species

2.2.1 Abstract species

Describes properties of essentially equivalent abstract entities. Types are examples of species.

2.2.2 Concrete species

Describes properties of essentially equivalent concrete entities. Types are examples of species.

2.3 Genus

2.3.1 Abstract genus

A set of abstract species with properties in common. Examples: numbers and vector spaces are each genera (plural of genus)

2.3.2 Concrete genus

A set of concrete species with properties in common.

2.4 Data

A datum is a finite string of 0 s and 1 s. Data is the plural of datum. A datum corresponding to a particular entity is said to represent that entity.

2.5 Interpretation

The entity that corresponds to a datum is the interpretation of that datum.

2.6 Values

A value is a datum together with it's interpretation.

2.7 Equality

Two values are equal if they refer to the same abstract entity. This may sound circular, but it doesn't have to be, it reduces equality of values to equality in the mathematical theories those values model. For example, if the int type models ℤ, the integers, then we can use the definition of equality as equality of integers. Those integers are not values in our program, they are abstract entities, so there is no circularity, it just "bottoms out" in the Platonic realm.

2.8 Representational equality

Two values are representationally equal if they have identical strings of bits.

2.9 Value type

A value type is a correspondence between a species and a set of data.

2.10 Uniquely represented

A value type is uniquely represented if each datum corresponds to at most one abstract entity.

2.10.1 Examples:

Imagine a value type as 3-bit integers that use the first bit as a sign (positive or negative) and the remaining two bits to store the magnitude, then the set of all possible values are:

datum entity
000 0
001 1
010 2
011 3
100 0
101 -1
110 -2
111 -3

Notice how 000 and 100 represent 0 and -0, which are the same abstract entity, but different strings of bits, this value type is therefore not uniquely represented.

Now imagine a twos complement integer type:

datum entity
000 0
001 1
010 2
011 3
100 4
101 -3
110 -2
111 -1

Observe that none of the entities are the same, so this value type is uniquely represented.

2.11 Ambiguity

A value type is ambiguous if there is a datum that has more than one interpretation.

2.12 Lemma 1.1

2.13 Lemma 1.2

If a value type is unambiguous, then representational equality implies equality.

3 Chapter 2: Transformations and Their Orbits

3.1 Arity

The arity of a function is the number of parameters it takes. For example, \(\cos\) has an arity of 1. The addition function \(+\) has an arity of 2, and it takes two numbers \(x,y\) and then returns their sum \(x+y\).

A function of arity \(n\) is called \(n\) -ary.

Some common special cases have their own special names, like 1-ary functions are unary, and 2-ary functions are binary.

3.2 Homogeneous function

Any function where all the arguments have the same type.

3.3 Operations

An operation is a homogeneous function of nonzero arity.

Given a type \(T\), the function \(f : T \times T \times ... \times T \to T\) is an operation.

3.3.1 Examples

The familiar operations from arithmetic, like \(+\) and \(\times\) are operations, as are

3.4 Transformations

a transformation is a unary operation

For example, square is a transformation on int

template<typename Op>
    requires(BinaryOperation(Op))
Domain(Op) square(const Domain(Op)& x, Op op)
{
    return op(x, x);
}

Where we define Domain(.) using a macro. Observe that the requires(...) macro throws away everything, it is for documentation purposes only. In the year 2020, the C++20 standard will provide language-level support for concepts and requires constraints.

#define Domain(ftype) typename ftype::DomainType
#define requires(...)
#define DistanceType(ftype) std::size_t

An example of the ftype would be BinaryOperation

Testing out the square template function applied to Op, and then applied to 2 and 3:

template <typename T>
struct Transformation {
  typedef T DomainType;
  typedef T ReturnType;
  typedef std::function<T(T)> FuncType;
  Transformation(FuncType fn) {
    this->fn = fn;
  }
  FuncType fn;
  ReturnType operator()(DomainType x) {
    return fn(x);
  };
};
template <typename T>
struct BinaryOperation {
  typedef T DomainType;
  typedef T ReturnType;
  typedef std::function<T(T,T)> FuncType;
  FuncType fn;
  ReturnType operator()(DomainType x, DomainType y) {
    return fn(x,y);
  };
};

Now we tie it all together and try it out:

#define Domain(ftype) typename ftype::DomainType
#define requires(...)
#define DistanceType(ftype) std::size_t
template<typename Op>
    requires(BinaryOperation(Op))
Domain(Op) square(const Domain(Op)& x, Op op)
{
    return op(x, x);
}
template <typename T>
struct BinaryOperation {
  typedef T DomainType;
  typedef T ReturnType;
  typedef std::function<T(T,T)> FuncType;
  FuncType fn;
  ReturnType operator()(DomainType x, DomainType y) {
    return fn(x,y);
  };
};

int main() {
  typedef BinaryOperation<int> Op;
  Op mult;
  mult.fn = [](int a, int b) { return a * b; };

  cout << square<Op>(2, mult) << endl
      << square<Op>(3, mult) << endl
      << square<Op>(4, mult) << endl;
}

RESULTS:

4
9
16

3.5 Orbits

Given a transformation \(f\) and a point \(x\), we define the orbit as the set \(O_{x,f}= \{x, f(x), f(f(x)), ... \}\)

3.5.1 Distance

If \(x\) and \(y\) are both of type \(T\), and the transformation \(f\) can be applied finitely many times so that \(y = f(f(....f(x)))\), then the number of times you have to apply \(f\) to \(x\) to get \(y\) is the distance from \(x\) to \(y\).

We can define the distance function in C++ like so:

template<typename F>
    requires(Transformation(F))
DistanceType(F) distance(Domain(F) x, Domain(F) y, F f)
{
  // Precondition: y is reachable from x under f
  typedef DistanceType(F) N;
  N n(0);
  while(x != y) {
    x = f(x);
    n = n + N(1);
  }
  return n;
}

The DistanceType(F) type is defined to be any integer type that is exactly large enough to hold the distance between any two points x and y. Given a type \(T\) with finitely many values of that type, the maximum distance with respect to a transformation \(f : T \to T\) would have to be the number of values of type \(T\).

In C++, we can use std::size_t to define our DistanceType:

#define DistanceType(ftype) std::size_t

Now let's define a particular transformation and compute the distance between two points. Let the transformation be the increment function on the integers: \(\text{inc} : \mathbb{Z} \to \mathbb{Z}\), then the distance between 13 and 23 should be 10.

#define DistanceType(ftype) std::size_t
#define Domain(ftype) typename ftype::DomainType
#define requires(...)
#define DistanceType(ftype) std::size_t
template <typename T>
struct Transformation {
  typedef T DomainType;
  typedef T ReturnType;
  typedef std::function<T(T)> FuncType;
  Transformation(FuncType fn) {
    this->fn = fn;
  }
  FuncType fn;
  ReturnType operator()(DomainType x) {
    return fn(x);
  };
};
template<typename F>
    requires(Transformation(F))
DistanceType(F) distance(Domain(F) x, Domain(F) y, F f)
{
  // Precondition: y is reachable from x under f
  typedef DistanceType(F) N;
  N n(0);
  while(x != y) {
    x = f(x);
    n = n + N(1);
  }
  return n;
}

int main() {
  typedef Transformation<int> F;
  F inc([](int a) { return a+1; });

  cout << distance(13, 23, inc) << endl;
  return 0;
}

RESULTS: 10

As expected, we got 10. Next we move on to collision points.

3.6 Collision points

If an orbit is finite, then we can find the point where a transformation \(f\) loops back on itself by running two sequences, one at 1x speed and one at 2x speed. We seek the smallest \(n\) such that \(f^n(x) = f^{2n + 1}(x)\). A more detailed discussion of the algorithm is in the book, for here I include the C++ code, and test it out on several transformations, and do some of the exercises to modify and apply this code.

template<typename F, typename P>
  requires(Transformation(F) && UnaryPredicate(P))
Domain(F) collision_point(const Domain(F)& x, F f, P p) {
  // Precondition: p(x) and f(x) are defined
  if(!p(x)) return x; 
  Domain(F) slow = x;
  Domain(F) fast = f(x);
  while(fast != slow) {
    slow = f(slow);
    if(!p(fast)) return fast;
    fast = f(fast);
    if(!p(fast)) return fast;
    fast = f(fast);
  }
  return fast;
}

If the orbit is nonterminating for a particular element \(x\), then we can specialize the collision_point algorithm:

template<typename F>
  requires(Transformation(F))
Domain(F) collision_point_nonterminating_orbit(const Domain(F)& x, F f) {
  Domain(F) slow = x;
  Domain(F) fast = f(x);

  while(fast != slow) {
    slow = f(slow);
    fast = f(fast);
    fast = f(fast);
  }
  return fast;
  // Postconditon: return value is a collision point
}

To test this out, let's consider a simple transformation, that of incrementing an integer and taking the remainder modulo 13.

typedef Transformation<int> F;
F inc([](int a) { return (a+1) % 13; });

If we take the number 6, and then apply the transformation 6 times, we reach 12, applying it again we get 0, so the distance between 6 and 0 should be (6 + 1) = 7

3.6.1 testing distance with inc

#define DistanceType(ftype) std::size_t
#define Domain(ftype) typename ftype::DomainType
#define requires(...)
#define DistanceType(ftype) std::size_t
template <typename T>
struct Transformation {
  typedef T DomainType;
  typedef T ReturnType;
  typedef std::function<T(T)> FuncType;
  Transformation(FuncType fn) {
    this->fn = fn;
  }
  FuncType fn;
  ReturnType operator()(DomainType x) {
    return fn(x);
  };
};
template<typename F>
    requires(Transformation(F))
DistanceType(F) distance(Domain(F) x, Domain(F) y, F f)
{
  // Precondition: y is reachable from x under f
  typedef DistanceType(F) N;
  N n(0);
  while(x != y) {
    x = f(x);
    n = n + N(1);
  }
  return n;
}

int main() {
  typedef Transformation<int> F;
  F inc([](int a) { return (a+1) % 13; });
  cout << distance(6, 0, inc) << endl;
  return 0;
}

RESULTS: 7

Works correctly! Now let's move on to finding the collision point:

3.6.2 testing collision_point using inc

If we take our inc operation and imagine playing out the collision finding algorithm starting at 0, fast takes 6 steps to get to 12, slow is at 5. After 7 more steps, slow is at 12, and fast has gone (1+6) to hit 12 again. The call will look like this:

collision_point_nonterminating_orbit(0, inc);

And we expect the result to be 12. Weaving all the code together:

#define Domain(ftype) typename ftype::DomainType
#define requires(...)
#define DistanceType(ftype) std::size_t
template <typename T>
struct Transformation {
  typedef T DomainType;
  typedef T ReturnType;
  typedef std::function<T(T)> FuncType;
  Transformation(FuncType fn) {
    this->fn = fn;
  }
  FuncType fn;
  ReturnType operator()(DomainType x) {
    return fn(x);
  };
};
template<typename F>
  requires(Transformation(F))
Domain(F) collision_point_nonterminating_orbit(const Domain(F)& x, F f) {
  Domain(F) slow = x;
  Domain(F) fast = f(x);

  while(fast != slow) {
    slow = f(slow);
    fast = f(fast);
    fast = f(fast);
  }
  return fast;
  // Postconditon: return value is a collision point
}

int main() {
  typedef Transformation<int> F;
  F inc([](int a) { return (a+1) % 13; });
  cout << collision_point_nonterminating_orbit(0, inc) << endl;
  return 0;
}

Beautiful! Next we should take a detour and make some visualizations. Since I'm a nerd I will of course do this with Graphviz.

3.7 Visualizing orbits!

Our implementation of the transformation type is flexible enough to allow for wrapping inc and producing some output along the way:

typedef Transformation<int> F;

F inc([](int a) {
    int fa = (a+1) % 13; 
    cout << a << "->" << fa << "; ";
    return fa;
});

Now we re-run the previous program with the new inc

#define Domain(ftype) typename ftype::DomainType
#define requires(...)
#define DistanceType(ftype) std::size_t
template <typename T>
struct Transformation {
  typedef T DomainType;
  typedef T ReturnType;
  typedef std::function<T(T)> FuncType;
  Transformation(FuncType fn) {
    this->fn = fn;
  }
  FuncType fn;
  ReturnType operator()(DomainType x) {
    return fn(x);
  };
};
template<typename F>
  requires(Transformation(F))
Domain(F) collision_point_nonterminating_orbit(const Domain(F)& x, F f) {
  Domain(F) slow = x;
  Domain(F) fast = f(x);

  while(fast != slow) {
    slow = f(slow);
    fast = f(fast);
    fast = f(fast);
  }
  return fast;
  // Postconditon: return value is a collision point
}

int main() {
  typedef Transformation<int> F;

  F inc([](int a) {
      int fa = (a+1) % 13; 
      cout << a << "->" << fa << "; ";
      return fa;
  });
  cout << collision_point_nonterminating_orbit(0, inc) << endl;
  return 0;
}

In emacs, run org-bable-tangle Then eval this next src block:

g++ collision_point.cpp -std=c++14; echo "strict digraph { $(./a.out) }" | circo -Tpng > inc_13.png

inc_13.png

Not all orbits will be perfectly circular, consider the inc function on the integers, it never terminates. Also, consider half which takes the half of an integer using integer division. half always terminates at 0, meaning that there is no collision point for either of those orbits.

There's a fourth possibility, that there is a "handle" and then a nice circular orbit. Let's find an example and illustrate is using our neat Graphviz trick.

typedef Transformation<int> F;

F handle_tx([](int a) {
    int fa = (a < 4) ? a+1 : (a%5 + 4);
    cout << a << "->" << fa << "; ";
    return fa;
});

Now we re-run the previous program with handle_tx in place of inc

#define Domain(ftype) typename ftype::DomainType
#define requires(...)
#define DistanceType(ftype) std::size_t
template <typename T>
struct Transformation {
  typedef T DomainType;
  typedef T ReturnType;
  typedef std::function<T(T)> FuncType;
  Transformation(FuncType fn) {
    this->fn = fn;
  }
  FuncType fn;
  ReturnType operator()(DomainType x) {
    return fn(x);
  };
};
template<typename F>
  requires(Transformation(F))
Domain(F) collision_point_nonterminating_orbit(const Domain(F)& x, F f) {
  Domain(F) slow = x;
  Domain(F) fast = f(x);

  while(fast != slow) {
    slow = f(slow);
    fast = f(fast);
    fast = f(fast);
  }
  return fast;
  // Postconditon: return value is a collision point
}

int main() {
  typedef Transformation<int> F;

  F handle_tx([](int a) {
      int fa = (a < 4) ? a+1 : (a%5 + 4);
      cout << a << "->" << fa << "; ";
      return fa;
  });
  cout << collision_point_nonterminating_orbit(0, handle_tx) << endl;
  return 0;
}
g++ collision_point_handle.cpp -std=c++14; echo "strict digraph { $(./a.out) }" | circo -Tpng > handle_tx.png

handle_tx.png

3.8 Lemma 2.4

Let \(o\) be the orbit size, \(h\) the handle size, and \(c\) the cycle size.

\(o = h + c\)

3.8.1 Proof

Suppose there was a point \(x\) in the orbit that was neither in the handle nor in the cycle.

Let \(y\) be any point on the cycle, there can't be any path from \(y\) to \(x\), since the cycle property would require that there is a path from \(x\) back to \(y\).

Let \(z\) be a point on the handle, there can't be any path from \(z\) to \(x\), since that would mean \(x\) is part of the handle, or the cycle, both are forbidden by definition. There is no where left that \(x\) could go, which means that \(o = h + c\).

3.9 Connection points

Earlier we explored the collision point where \(f^{n}(x) = f^{2n+1}\). That tells us that the orbit has a cycle, but if we want to know how big the handle is (the number \(h\)), then we need to do more. In the book, there is a proof that taking \(h+1\) steps past the collision point visits the same point as taking \(h\) steps past \(x\).

\(f^h(x) = f^{h+1}(y)\) where \(y\) is the collision point.

Since we don't know \(h\), we use this proof to run two sequences \(\{x, f(x), \cdots\}\) and \(\{f(y), f(f(y)), \cdots\}\) forward until they converge, we call this the convergent point. By the proof in the book, we also know that this is the connection point between the handle and the cycle.

template <typename F>
  requires(Transformation(F))
Domain(F) convergent_point(Domain(F) x0, Domain(F) x1, F f) {
  while(x0 != x1) {
    x0 = f(x0);
    x1 = f(x1);
  }
  return x0;
} 

Then we can apply the convergent_point function to find the connection point of the orbit starting at \(x\):

template <typename F>
  requires(Transformation(F))
Domain(F) connection_point_nonterminating_orbit(const Domain(F)& x, F f) {
  Domain(F) y = collision_point_nonterminating_orbit(x, f);
  return convergent_point(x, f(y), f);
} 

3.9.1 testing out connection_point on the handle_tx transformation

Earlier we defined handle_tx and visualised it's orbit structure, it looks like this:

typedef Transformation<int> F;

F handle_tx([](int a) {
    int fa = (a < 4) ? a+1 : (a%5 + 4);
    return fa;
});

handle_tx.png

From the image above, you can see that 4 is the connection point, let's try it out!

#define Domain(ftype) typename ftype::DomainType
#define requires(...)
#define DistanceType(ftype) std::size_t
template <typename T>
struct Transformation {
  typedef T DomainType;
  typedef T ReturnType;
  typedef std::function<T(T)> FuncType;
  Transformation(FuncType fn) {
    this->fn = fn;
  }
  FuncType fn;
  ReturnType operator()(DomainType x) {
    return fn(x);
  };
};
template<typename F>
  requires(Transformation(F))
Domain(F) collision_point_nonterminating_orbit(const Domain(F)& x, F f) {
  Domain(F) slow = x;
  Domain(F) fast = f(x);

  while(fast != slow) {
    slow = f(slow);
    fast = f(fast);
    fast = f(fast);
  }
  return fast;
  // Postconditon: return value is a collision point
}
template <typename F>
  requires(Transformation(F))
Domain(F) convergent_point(Domain(F) x0, Domain(F) x1, F f) {
  while(x0 != x1) {
    x0 = f(x0);
    x1 = f(x1);
  }
  return x0;
} 
template <typename F>
  requires(Transformation(F))
Domain(F) connection_point_nonterminating_orbit(const Domain(F)& x, F f) {
  Domain(F) y = collision_point_nonterminating_orbit(x, f);
  return convergent_point(x, f(y), f);
} 
template <typename F>
  requires(Transformation(F))
triple<DistanceType(F), DistanceType(F), Domain(F)>
orbit_structure_nonterminating_orbit(const Domain(F)& x, F f) {
  typedef DistanceType(F) N;
  Domain(F) y = connection_point_nonterminating_orbit(x, f);
  return triple<N, N, Domain(F)>(distance(x, y, f),
                                 distance(f(y), y, f),
                                 y);

} 

int main() {
  typedef Transformation<int> F;

  F handle_tx([](int a) {
      int fa = (a < 4) ? a+1 : (a%5 + 4);
      return fa;
  });
  cout << connection_point_nonterminating_orbit(0, handle_tx) << endl;
  return 0;
}

RESULTS: 4

As expected!

3.9.2 Measuring orbit sizes

Of the three finite types of orbits (circular, ρ-shaped and terminating), we can summarize the structure of each type of orbit using this table:

Case      
Terminating \(h-1\) 0 terminal element
Circular 0 \(c-1\) \(x\)
ρ-shaped \(h\) \(c-1\) connection point

Aside, to return a triple of numbers in C++, we could use a struct, a class, or a std::tuple<> the book uses triple<DistanceType(F), DistanceType(F), Domain(F)>, so I will find a way to use templates or macros to make the book's syntax compile. Future me: please wish me luck.

template<typename A, typename B, typename C>
using triple = std::tuple<A,B,C>;

Awesome, that compiled! I wrote that on the bus.

template <typename F>
  requires(Transformation(F))
triple<DistanceType(F), DistanceType(F), Domain(F)>
orbit_structure_nonterminating_orbit(const Domain(F)& x, F f) {
  typedef DistanceType(F) N;
  Domain(F) y = connection_point_nonterminating_orbit(x, f);
  return triple<N, N, Domain(F)>(distance(x, y, f),
                                 distance(f(y), y, f),
                                 y);

} 

3.10 Exercise 2.5

Use orbit_structure_nonterminating_orbit to determine the average handle size and cycle size of the pseudorandom number generators on your platform for various seeds. The C standard library <stdlib.h> has a pseudorandom number generator, you use it by setting a seed with srand(seed) and then get the next element of the sequence using rand()

#define Domain(ftype) typename ftype::DomainType
#define requires(...)
#define DistanceType(ftype) std::size_t

template<typename A, typename B, typename C>
using triple = std::tuple<A,B,C>;

template<typename F>
    requires(Transformation(F))
DistanceType(F) distance(Domain(F) x, Domain(F) y, F f)
{
  // Precondition: y is reachable from x under f
  typedef DistanceType(F) N;
  N n(0);
  while(x != y) {
    x = f(x);
    n = n + N(1);
  }
  return n;
}

template <typename T>
struct Transformation {
  typedef T DomainType;
  typedef T ReturnType;
  typedef std::function<T(T)> FuncType;
  Transformation(FuncType fn) {
    this->fn = fn;
  }
  FuncType fn;
  ReturnType operator()(DomainType x) {
    return fn(x);
  };
};

template<typename F>
  requires(Transformation(F))
Domain(F) collision_point_nonterminating_orbit(const Domain(F)& x, F f) {
  Domain(F) slow = x;
  Domain(F) fast = f(x);

  while(fast != slow) {
    slow = f(slow);
    fast = f(fast);
    fast = f(fast);
  }
  return fast;
  // Postconditon: return value is a collision point
}

template <typename F>
  requires(Transformation(F))
Domain(F) convergent_point(Domain(F) x0, Domain(F) x1, F f) {
  while(x0 != x1) {
    x0 = f(x0);
    x1 = f(x1);
  }
  return x0;
} 

template <typename F>
  requires(Transformation(F))
Domain(F) connection_point_nonterminating_orbit(const Domain(F)& x, F f) {
  Domain(F) y = collision_point_nonterminating_orbit(x, f);
  return convergent_point(x, f(y), f);
} 
template <typename F>
  requires(Transformation(F))
triple<DistanceType(F), DistanceType(F), Domain(F)>
orbit_structure_nonterminating_orbit(const Domain(F)& x, F f) {
  typedef DistanceType(F) N;
  Domain(F) y = connection_point_nonterminating_orbit(x, f);
  return triple<N, N, Domain(F)>(distance(x, y, f),
                                 distance(f(y), y, f),
                                 y);

} 

int main() {
  typedef Transformation<int> F;

  F f([](int seed) {
    srand(seed);
    return rand();
  });

  int x = 1;
  orbit_structure_nonterminating_orbit(x, f);

  return 0;
}

4 Chapter 3: Associative Operations

This chapter explores associative binary operations, which are operations of arity 2, with the property that:

op(a, op(b, c)) = op(op(a, b), c) for all a,b,c, \in T

By convention, if the type T and operation op are known in context, we will use multiplicative notation:

\(a(bc) = (ab)c\), \(\forall a,b,c \in T\)

Author: Tobi Lehman <tobi.lehman@gmail.com>

Created: 2018-09-29 Sat 07:52

Validate