Extremely Loud & Incredibly Close: pages 269-271

This page details my attempts to decipher the number sequences given on pages 269-271 of Extremely Loud & Incredibly Close.

Raw OCR output

The following raw OCR output was produced by FreeOCR version 3.0 (on July 26, 2011) from 300 dpi scans of pages 269-271.
  "6, 9, 6, 2,6, 2,4, 7, 2, 2,4, 2, 2, 2,8, 6, 2,6, 2, 4, 2, 8, 7, 8, 2, 7, 7, 4, 8, 2, 2 *
 ? 2, 2, 8, 8, 4, 2, 2, 4, 7, 7, 6, 7, 8, 4, 6, 2, 2, 2, 8, 6, 2, 4, 6, 2, 6, 7, 2, 4, 6, 2,
  2, 2, 7F 6, 4, 2, 2, 2, 6, 7, 4, 2, 2, 6, 2, 8, 7, 2, 6, 2, 4, 28 2, 7, 6, 2, 2, 8, 6, 2,
 2s      6, 2, 4, 2, 8, 7, 8, 2, 7, 7, 4, 8, 2, 9, 2, 8, 8, 4, 2, 2, 4, 7, 7, 6, 7, 8, 4, 6, 2, 2,
 2, 2, 8F 4, 2, 2, 4, 7, 7, 6, 7, 8,4, 6, 2, 2, 2, 8, 6, 2, 9, 6, 2, 6, 6, 2, 4, 6, 2, 2, 2, 1*
Z?  7! 6, 4, 3, 2, 2, 6, 7, 4, 2, 5, 6, 3, 8, 7, 2, 6, 3, 4, 3E 5, 7, 6, 3, 5, 8, 6, 2, 6, 3, 3
 , 4, 2, 8, 7, 8, 2, 7, 7, 4, 8, 2, 2, 2, 8F 7, 7, 4, 8, 2, 2, 2, 8, 2, 4, 2, 2, 4, 7,6, 6,
  7, 8, 4, 6, 8, 2, 8, 8,6, 2, 4, 6, 2, 6, 7, 2, 4, 6, 7, 7, 4, 8, 2, 2, 9, 8, 8, 4, 2, 2, 4*,
  4, 2, 7, 6, 7, 8, 4, 6,  2, 2, 2, 2, 6, 9, 4, 6, 2, 6, 7, 2, 4, 6F 2, 2, 6, 2, 6, 2, 9, 2,
  28 6, 9, 6, 2, 6, 2, 4, 7, 2, 2, 4, 2, 2, 2, 2, 6, 4, 6, 2, 4, 2, 2, 7, 2, 2, 7, 7, 4, 2, ,,6
  5, 5, 2, 9, 2, 4, 5, 2, 6! 4, 2, 2, 6, 5, 4, 2, 5, 7, 4, 5, 2., 5, 2, 6, 2, 6, 5, 4, 5, 2,
 , 7, 2, 2, 7, 7,4, 2, 2, 5, 2, 2, 2,4, 2, 2? 7, 2, 2, 7, 7,4, 2, 2, 2, 2, 2, 2,4, 5,2,
{  4- 7, 2, 2, 7, 2, 4, 6, 2, 2, 2, 2, 6, 2, 4, 6, 2, 6, 7, 2, 4! 4, 2, 2, 4, 2, 2, 6, 2, 8, N
is  4* 6, 2, 2, 2, 8, 6, 2, 9, 6, 2, 6, 6, 2, 4, 6, 2, 2, 2, 2! 2, 2, 2, 2, 2, 6, 2, 4, 2, 2, [4
  6, ,,8, 3,2,6,3,4,3? 5, 6, 8,3? 5, 37, 6,3,5,8, 6,2,6, 3, 4, 5, 8,3,8, 2, 3,

4, 8, 2, 2, 2, 88 2, 2,4, 8, 2, 2, 2, 8, 2,4, 2, 2,4, 7, 6, 6, 7, 8,4, 6, 8, 2, 8, 8,
6, 2,4,6, 28 2, 2, 7, 7,4, 2, 2, 2, 2, 9, 2,4, 2, 2, 68 4, 2, 2, 6, 2,4, 2, 2, 7,4,
5, 2, 5, 2, 6,2, 6, 5, 4, 5, 2, 7, 2, 2, 7, 7, 4, 2, 5, 5, 2,2, 2, 4, 5, 28 7, 2, 2, 7,
7, 4, 2,2, 2, 2,2, 2,4, 2, 2,4, 7,2,2,7, 2,4, 6,2,2,2,2,6, 2, 4, 6,2,6, 7,
2,48 6, 2, 2, 2, 78 6,4, 2, 2, 2, 6, 7,4, 2, 2, 6, 2, 2, 68 2, 6, 2,4, 28 2, 7,6, 2,
2,2,6, 2,6,2,4, 2, 2,7, 2, 2, 7, 7, 4,2, 2,9, 2, 2,2,4,2, 2,4, 2, 2, 6,2, 2,
4, 6, 2, 2, 2,28 4, 2,2,4,2,2,6,28 2,6, 8,28 2, 2,6, 2, 2,2,6,2,6, 2,4, 2,
8,2,8, 2, 2, 2,4, 8, 2,9,2,8, 8, 4, 2,2,4, 2, 2,6, 2,8, 4, 6, 3,3,3,88 4,2,
2,4, 3, 3, 6, 3, 8,4,6, 3! 5, 6, 8, 3? 5,6, 8, 3? 5, 6, 8, 3! 4, 2, 2,6, 5,4, 2, 5,
7, 4, 2,2,2,2,6, 2,6,2,4, 2,2,7,2, 2, 7, 4, 2,2,4, 6, 2,2,8, 6,2,6,2,4,
2,8, 7, 8,2,7, 7,4, 8,2,2,2,88 6,2,2,2,78 6,4, 2,2,2,6, 7,4, 2,2,6,2,
2, 68 2, 6, 2,4, 28 2, 7, 6, 2, 2, 2, 6, 2, 6, 2,4, 2, 2, 7, 2, 2, 7, 7,4, 2, 2, 9, 2,
2, 2,4, 2, 2,48 2, 6, 8, 28 2, 2,6, 2, 2,4,6, 2, 6, 7, 2,4, 6, 7, 7,4, 8, 2, 2,9,
8, 8, 4, 2, 2, 4, 2, 7, 6, 7, 8, 4, 6, 2, 2, 2, 2, 6, 9, 4, 6, 2, 6, 7, 2, 4, 68 2, 2, 6,
2,6, 2, 9, 2,28 6, 9, 6,2,6, 2, 4, 7, 2,2,4,2, 2,2,2,6, 4, 6,2,4,2,2,7, 2,,
2,7, 7, 4, 2,2,2,2,9,2,4,2, 2,68 4,2,2,6,2,4, 2,2,7, 4,2,2,2,2,6,2,
6,2,4,2,2, 7,2,2,7, 7,4, 2,2,2,2,2,2,4,2, 28 7,2, 2,7,7,4, 2,2,2,2,
2,2,4,2,2,4, 7,2,2,7,2,4, 6, 2, 2,2, 2,6,2,4, 6,2,6, 7,5,4I 6, 2, 2, 2,
7! 6, 4, 5, 2, 2, 6, 7, 4, 2, 5, 6, 5, 2, 6! 2, 6, 5, 4, 5? 5, 7, 6, 5, 5, 2, 6, 2, 6, 5,
4, 2,2,7,2,2,7, 7,4,2,5,9,2,2,2,4,S,2,4! 2, 6, 8,28 2, 2, 6,2,2,4, 6,
2,2, 5,284, 2,2,4,2,2, 6,28 2,5,5,2,9,2,4,5, 2,68 4,2,2,6,5,4,2!.5,
2, 6,2, 2,2,6,2,6, 2,4,2,8, 2,8,2, 2, 2,4, 8, 2,9,2,8, 8, 4, 2,2,4,2,2,
6,2,8, 4, 6,2,2, 2,884, 2,2,4,2,2,6,2,8, 48 6, 2,2,2,8, 6,2,9, 6,2,6,
6,2,4, 6, 5,3,5,3I 2,2,2,2,2,6,2,4,2,2, 6,2,8,2,2,6,2,4,28 2, 6, 8,
3?5,3,6,3,5,8,6,2,6.,3,4, 2, 8,2,8,2,2,2,4, 8,2,2,2,88 2,7,2,4,6,
2, 2, 2,2,6, 2, 4, 6, 2, 6, 7, 2, 486,2,2, 2, 78 6, 4, 2,2,2,6, 7,4,2,2, 6, 2,
2,68 2,6,2,4,28 2,7, 6,2,2,2,6,2,6, 2, 4, 2,2,7,2,2,7, 7, 4,2,2,9,2,
2,2,4, 2,2,4, 2, 2, 6,2,2,4, 6, S,5,5,2!4, 2, 2,4, 2,2,6,28 2, 6, 8,28 2,
2,6,2,2,2,6,2,6,2,4,2,8,2,8,2,2,2,4,8,2,9,2,8,8,4,2,2,4,2,4,
6,5,5,5,284,5,2,4,5,5,6,5!,6,5,4,5?4,5?5,5,6,5,5,2,6,2,6,3,4» *
2,8,2,8,2,2,2,4,8,2,9,2,8,8,4,2,2,4,2,2,6,2,8,4,6,2,2,2,884,
2,2,4,2,2,6,2,8,486,2,2,2,6,.7,4,2,2,6,2,8,7,2,6,2,4,282,7,6, 2
2,2, 8, 6,2,6,2,4, 2, 8, 7, 8, 2,7, 7,4, 8,2,2,2,88 7, 7,4, 8, 2,2,2,8,2,
4,2,2,4,7,6,6,7,8,4,6,8,2,8,8,6,2,4,6,2,6,7,2,4,6,7,7,4,8,2, 8
2,9,8,8,4,2,2,4,2,7,6,7,8,4,6,2,2,2,2,6,9,4,6,2,6,7,2,4,682,

2,6, 2,6, 5, 9,_5, 2? 6, 9, 6,2,6, 5,4, 7,5, 5,4,5, 1, 5, 1,6, 4, 6, 4,,,, 5, 2, 7,
2, 2, 7,7,4, 2, 5, S, 2,9, 2,4,5, 2,6] 4,2, 2,6, 5,4, ,2, 5, 7,4, li, gl, 5, 3,6, 2,
6,5,4,5,2,7,2,2,7,7,4,2,5,5,2,2,2,4,5,2!7,2,2,7,7,,*,2,5,5,2,2, .
2,4, 5, 2,4, 7, 2, 2, 7, 2,4, 6, 5, 5, 5, 2, 6,5,4,6, 5, 6, 7, 5, 4] FI, 5, 5, 5, 7Y 6,
4, 5, 2, 2, 6, 7,4, 2, S, 6,5, 2,6! 2, 6, 5,4, S? 5, 7,6, 5,5,1, 6, 1,6, 5,,], 5, 2,
7, 2, 2, 7, 7,4,2,5,9, 2,2,2,4, 5,2,4I 5,6, 8, 3P 5,5,6, 5, ,:,4, 6, 5, 5, 5, .n!
4,5, 2,4, 5, 5, 6,5F 8, 6, 5,9, 6, 5,6, 6, 5,4, 6, 5, 5,5, 5, 2, 2, 5, 5, ,5.6, 5,4,
2, 5,6, 3, 8, 3, 2,6, 3,4, 3P 5,6, 8, 3P 5, 3,6, 3, 5, 8,6, 2,6, 4,4, 5, H, 4, H,
2,5, 5,4, 8»3¤3,2,8I 5,5,4, 8,5, 5,2,8,5,4,5,2,4, 7, 6, 6, 7·H,·¤I,'*,*'*, 5,
8,8, 6,3,4, 6,5F 2,2,7, 7,4, 6, 7, 4,2,5,6,5,8, 7,2,6,5,4, .58 5,7,**, 5, 5,
8,6, 2,6, 5,4,5,8, 7,8, 2, 7, 7,4,8, 3,3,2,8E 7, 7,4,8, 5, 5, 2, 8, .5,4, 5, 2,
4, 7, 6,6, 7, 8,4,6, 8,5,8, 8, 6,5,4, 6,5,6, 7,5,4, 6, 7, 7,4, 8,5,.5,9, 8,8,
4,5,2,4,5,7, 6, 7, 8,4, 6,5,5, 5,2,6, 9,4,6,5,6, 7,5,4, 68 5,2·('·8·('·$·
9, 5,286, 9, 6»2»6» 5,4, 7,5,5,4,5,2,5,2,6,4, 6,2,4,5,2,7,2,2,7,7,4,
2,5,5,2,9,2,4,5,2,6!4,2,2,6,5,4,2,5,7,4,5,2,5,2,6,2,6,5,4,5,2,
7,2,2,7, 7,4,2,S,5,2,2,2,4, 5,2!7,2,2,7, 7,4,2,5, 5,2,2,2,4, 5,2,4,
7,2,2,7,2,4,6, 5, 5, 5,2,6, 5,4, 6, 5,6, 7,5,4I 6, 5, 5, 5,7!6,4, 5,2,2,6,
7,4=2,5,6,5,2,6! 2,6, 5,4,585, 7,6, 5,5,2,6,2,6, 5,4, 5,2,7,2,2,7, 7,
4, 2, 5, 9, 2, 2, 2, 4, 5, 2,4I 5, 6, 8, 58 5, 5, 6, 5, 2, 4, 6, 5, 5, 5, zi 4, 5, 2, 4,
5, 5, 6, 5I 2, 5, 5, 2,9, 2,4, 5, 2,6! 4, 2, 2,6, 5,4, 2! 5, 5,6, 5, 5, 2,6, 2, 6, 3,
»5,5,8,5,8,2,5, 5,4, 8,5,9,2,8, 8,4,5,2,4,5,5,6,5,8,4,6,5,5,5,8!4,
5,2,4,5,5,6,5,8,4! 6,5,5,5,8, 6,5,9,6,5,6,6,5,4,6, 5,5,5,5!2,2,5,
5, 2, 6, 5, 4, 2, 5, 6, 5, 8, 5, 2, 6, 5, 4, 58 5, 6, 8, 58 5, 5, 6, 5, 5, 8, 6, 2, 6, 5,
·|-,5,8,3,8,2,3,3»4,8,3,3,2,8! 2,7, 2,4,6,5,5, 5,2,6,5,4,6, 5,6, 7, 5,
4F 6, 5, 5, 5, 7Y 6,4, 5, 2, 2, 6, 7,4, 2, 5, 6, 5, 2, 6F 2, 6, 5,4, 58 5, 7, 6, 5, 5,
2,6,2,6,5,4, 5,2,7,2,2,7, 7,4,2,5, 9,2,2,2,4,5,2,4, 5,5,6,5,2,4,6,
5, 5, 5, 2F 4, 5, 2, 4, 5, 5, 6, 5F 5, 6, 8, 58 5, 5, 6, 5, 5, 2, 6, 2, 6, 5, 4, 5, 8, 5,
8,2,5,5,4, 8,5,9,2,8, 8,4,3,2,4,3»3,6,3,8»4»6,3,3,3,8_! 4,5,2,4,5,
5,6,5,8, 4, 6,5F5,6, 8,585,6, 8,585, 6, 8,5!4,2,2,6,5,4,2,5, 7,4, 5,
-·,5,2,6,2,6,5,4,5,2,7,2,2,7,4,5,2,4,6,5,5,8,6,2,6,5,4, 5,8, 7,8,
-·,7, 7,4, 8,5,5,2,8F 7, 7,4, 8,5,5,2,8,5,4,5,2,4, 7, 6,6, 7, 8,4,6, 8,5,
*'·H~6,3»4¤6,3,6,7,3,4,6,7,7%,8,3,3,9,8,8·4,3,2»4,5,7,6»7,8,4, 6
6, 5, 5, 5, 2,6, 9,4, 6, 5, 6, 7, 5,4,6F 5,2,6,2,6, 5,9, 5,28 6, 9, 6,2,6,5,4,
  6, 5, 2,4, 6, 5, 5, 5, 2, 7,4, 2, 5, 5, 2, 2, 2,4, 5, 2F 7, 2, 2, 7, 7,4, 2, 5, 5, 2,
  5,2,4, 7,2,2,7,2,4,6,5,5,5,2,6,5,4,6,5,6, 7,5,4F 6, 5,5, 5,7!"

Corrected OCR output

The raw OCR output needed some correcting. I used the MySample editor, which can display both text and image files, to perform the editing. (On August 14, 2011, I made an additional correction: on the 20th line of page 270, I replaced "5, 6, 6" by "5, 5, 6". The errors that I found on June 9, 2013 have also been corrected.)
"6, 9, 6, 2, 6, 3, 4, 7, 3, 5, 4, 3, 2, 5, 8, 6, 2, 6, 3, 4, 5, 8, 7, 8, 2, 7, 7, 4, 8, 3,
3, 2, 8, 8, 4, 3, 2, 4, 7, 7, 6, 7, 8, 4, 6, 3, 3, 3, 8, 6, 3, 4, 6, 3, 6, 7, 3, 4, 6, 5,
3, 5, 7! 6, 4, 3, 2, 2, 6, 7, 4, 2, 5, 6, 3, 8, 7, 2, 6, 3, 4, 3? 5, 7, 6, 3, 5, 8, 6, 2,
6, 3, 4, 5, 8, 7, 8, 2, 7, 7, 4, 8, 3, 9, 2, 8, 8, 4, 3, 2, 4, 7, 7, 6, 7, 8, 4, 6, 3, 3,
3, 8! 4, 3, 2, 4, 7, 7, 6, 7, 8, 4! 6, 3, 3, 3, 8, 6, 3, 9, 6, 3, 6, 6, 3, 4, 6, 5, 3, 5,
7! 6, 4, 3, 2, 2, 6, 7, 4, 2, 5, 6, 3, 8, 7, 2, 6, 3, 4, 3? 5, 7, 6, 3, 5, 8, 6, 2, 6, 3,
4, 5, 8, 7, 8, 2, 7, 7, 4, 8, 3, 3, 2, 8! 7, 7, 4, 8, 3, 3, 2, 8, 3, 4, 3, 2, 4, 7, 6, 6,
7, 8, 4, 6, 8, 3, 8, 8, 6, 3, 4, 6, 3, 6, 7, 3, 4, 6, 7, 7, 4, 8, 3, 3, 9, 8, 8, 4, 3, 2,
4, 5, 7, 6, 7, 8, 4, 6, 3, 5, 5, 2, 6, 9, 4, 6, 5, 6, 7, 5, 4, 6! 5, 2, 6, 2, 6, 5, 9, 5,
2? 6, 9, 6, 2, 6, 5, 4, 7, 5, 5, 4, 5, 2, 5, 2, 6, 4, 6, 2, 4, 5, 2, 7, 2, 2, 7, 7, 4, 2,
5, 5, 2, 9, 2, 4, 5, 2, 6! 4, 2, 2, 6, 5, 4, 2, 5, 7, 4, 5, 2, 5, 2, 6, 2, 6, 5, 4, 5, 2,
7, 2, 2, 7, 7, 4, 2, 5, 5, 2, 2, 2, 4, 5, 2! 7, 2, 2, 7, 7, 4, 2, 5, 5, 2, 2, 2, 4, 5, 2,
4, 7, 2, 2, 7, 2, 4, 6, 5, 5, 5, 2, 6, 5, 4, 6, 5, 6, 7, 5, 4! 4, 3, 2, 4, 3, 3, 6, 3, 8,
4! 6, 3, 3, 3, 8, 6, 3, 9, 6, 3, 6, 6, 3, 4, 6, 5, 3, 5, 3! 2, 2, 3, 3, 2, 6, 3, 4, 2, 5,
6, 3, 8, 3, 2, 6, 3, 4, 3? 5, 6, 8, 3? 5, 3, 6, 3, 5, 8, 6, 2, 6, 3, 4, 5, 8, 3, 8, 2, 3,

4, 8, 3, 3, 2, 8! 3, 3, 4, 8, 3, 3, 2, 8, 3, 4, 3, 2, 4, 7, 6, 6, 7, 8, 4, 6, 8, 3, 8, 8,
6, 3, 4, 6, 3! 2, 2, 7, 7, 4, 2, 5, 5, 2, 9, 2, 4, 5, 2, 6! 4, 2, 2, 6, 5, 4, 2, 5, 7, 4,
5, 2, 5, 2, 6, 2, 6, 5, 4, 5, 2, 7, 2, 2, 7, 7, 4, 2, 5, 5, 2, 2, 2, 4, 5, 2! 7, 2, 2, 7,
7, 4, 2, 5, 5, 2, 2, 2, 4, 5, 2, 4, 7, 2, 2, 7, 2, 4, 6, 5, 5, 5, 2, 6, 5, 4, 6, 5, 6, 7,
5, 4! 6, 5, 5, 5, 7! 6, 4, 5, 2, 2, 6, 7, 4, 2, 5, 6, 5, 2, 6! 2, 6, 5, 4, 5? 5, 7, 6, 5,
5, 2, 6, 2, 6, 5, 4, 5, 2, 7, 2, 2, 7, 7, 4, 2, 5, 9, 2, 2, 2, 4, 5, 2, 4, 5, 5, 6, 5, 2,
4, 6, 5, 5, 5, 2! 4, 5, 2, 4, 5, 5, 6, 5! 5, 6, 8, 3? 5, 5, 6, 5, 5, 2, 6, 2, 6, 3, 4, 5,
8, 3, 8, 2, 3, 3, 4, 8, 3, 9, 2, 8, 8, 4, 3, 2, 4, 3, 3, 6, 3, 8, 4, 6, 3, 3, 3, 8! 4, 3,
2, 4, 3, 3, 6, 3, 8, 4, 6, 3! 5, 6, 8, 3? 5, 6, 8, 3? 5, 6, 8, 3! 4, 2, 2, 6, 5, 4, 2, 5,
7, 4, 5, 2, 5, 2, 6, 2, 6, 5, 4, 5, 2, 7, 2, 2, 7, 4, 5, 2, 4, 6, 3, 5, 8, 6, 2, 6, 3, 4,
5, 8, 7, 8, 2, 7, 7, 4, 8, 3, 3, 2, 8! 6, 5, 5, 5, 7! 6, 4, 5, 2, 2, 6, 7, 4, 2, 5, 6, 5,
2, 6! 2, 6, 5, 4, 5? 5, 7, 6, 5, 5, 2, 6, 2, 6, 5, 4, 5, 2, 7, 2, 2, 7, 7, 4, 2, 5, 9, 2,
2, 2, 4, 5, 2, 4! 5, 6, 8, 3? 5, 5, 6, 5, 2, 4, 6, 3, 6, 7, 3, 4, 6, 7, 7, 4, 8, 3, 3, 9,
8, 8, 4, 3, 2, 4, 5, 7, 6, 7, 8, 4, 6, 3, 5, 5, 2, 6, 9, 4, 6, 5, 6, 7, 5, 4, 6! 5, 2, 6,
2, 6, 5, 9, 5, 2? 6, 9, 6, 2, 6, 5, 4, 7, 5, 5, 4, 5, 2, 5, 2, 6, 4, 6, 2, 4, 5, 2, 7, 2,
2, 7, 7, 4, 2, 5, 5, 2, 9, 2, 4, 5, 2, 6! 4, 2, 2, 6, 5, 4, 2, 5, 7, 4, 5, 2, 5, 2, 6, 2,
6, 5, 4, 5, 2, 7, 2, 2, 7, 7, 4, 2, 5, 5, 2, 2, 2, 4, 5, 2! 7, 2, 2, 7, 7, 4, 2, 5, 5, 2,
2, 2, 4, 5, 2, 4, 7, 2, 2, 7, 2, 4, 6, 5, 5, 5, 2, 6, 5, 4, 6, 5, 6, 7, 5, 4! 6, 5, 5, 5,
7! 6, 4, 5, 2, 2, 6, 7, 4, 2, 5, 6, 5, 2, 6! 2, 6, 5, 4, 5? 5, 7, 6, 5, 5, 2, 6, 2, 6, 5,
4, 5, 2, 7, 2, 2, 7, 7, 4, 2, 5, 9, 2, 2, 2, 4, 5, 2, 4! 5, 6, 8, 3? 5, 5, 6, 5, 2, 4, 6,
5, 5, 5, 2! 4, 5, 2, 4, 5, 5, 6, 5! 2, 5, 5, 2, 9, 2, 4, 5, 2, 6! 4, 2, 2, 6, 5, 4, 2! 5,
5, 6, 5, 5, 2, 6, 2, 6, 3, 4, 5, 8, 3, 8, 2, 3, 3, 4, 8, 3, 9, 2, 8, 8, 4, 3, 2, 4, 3, 3,
6, 3, 8, 4, 6, 3, 3, 3, 8! 4, 3, 2, 4, 3, 3, 6, 3, 8, 4! 6, 3, 3, 3, 8, 6, 3, 9, 6, 3, 6,
6, 3, 4, 6, 5, 3, 5, 3! 2, 2, 3, 3, 2, 6, 3, 4, 2, 5, 6, 3, 8, 3, 2, 6, 3, 4, 3? 5, 6, 8,
3? 5, 3, 6, 3, 5, 8, 6, 2, 6, 3, 4, 5, 8, 3, 8, 2, 3, 3, 4, 8, 3, 3, 2, 8! 2, 7, 2, 4, 6,
5, 5, 5, 2, 6, 5, 4, 6, 5, 6, 7, 5, 4! 6, 5, 5, 5, 7! 6, 4, 5, 2, 2, 6, 7, 4, 2, 5, 6, 5,
2, 6! 2, 6, 5, 4, 5? 5, 7, 6, 5, 5, 2, 6, 2, 6, 5, 4, 5, 2, 7, 2, 2, 7, 7, 4, 2, 5, 9, 2,
2, 2, 4, 5, 2, 4, 5, 5, 6, 5, 2, 4, 6, 5, 5, 5, 2! 4, 5, 2, 4, 5, 5, 6, 5! 5, 6, 8, 3? 5,
5, 6, 5, 5, 2, 6, 2, 6, 3, 4, 5, 8, 3, 8, 2, 3, 3, 4, 8, 3, 9, 2, 8, 8, 4, 3, 2, 4, 3, 4,
6, 5, 5, 5, 2! 4, 5, 2, 4, 5, 5, 6, 5! 6, 5, 4, 5? 4, 5? 5, 5, 6, 5, 5, 2, 6, 2, 6, 3, 4,
5, 8, 3, 8, 2, 3, 3, 4, 8, 3, 9, 2, 8, 8, 4, 3, 2, 4, 3, 3, 6, 3, 8, 4, 6, 3, 3, 3, 8! 4,
3, 2, 4, 3, 3, 6, 3, 8, 4! 6, 3, 3, 3, 6, 7, 4, 2, 5, 6, 3, 8, 7, 2, 6, 3, 4, 3? 5, 7, 6,
3, 5, 8, 6, 2, 6, 3, 4, 5, 8, 7, 8, 2, 7, 7, 4, 8, 3, 3, 2, 8! 7, 7, 4, 8, 3, 3, 2, 8, 3,
4, 3, 2, 4, 7, 6, 6, 7, 8, 4, 6, 8, 3, 8, 8, 6, 3, 4, 6, 3, 6, 7, 3, 4, 6, 7, 7, 4, 8, 3,
3, 9, 8, 8, 4, 3, 2, 4, 5, 7, 6, 7, 8, 4, 6, 3, 5, 5, 2, 6, 9, 4, 6, 5, 6, 7, 5, 4, 6! 5,

2, 6, 2, 6, 5, 9, 5, 2? 6, 9, 6, 2, 6, 5, 4, 7, 5, 5, 4, 5, 2, 5, 2, 6, 4, 6, 2, 4, 5, 2, 7,
2, 2, 7, 7, 4, 2, 5, 5, 2, 9, 2, 4, 5, 2, 6! 4, 2, 2, 6, 5, 4, 2, 5, 7, 4, 5, 2, 5, 2, 6, 2,
6, 5, 4, 5, 2, 7, 2, 2, 7, 7, 4, 2, 5, 5, 2, 2, 2, 4, 5, 2! 7, 2, 2, 7, 7, 4, 2, 5, 5, 2, 2,
2, 4, 5, 2, 4, 7, 2, 2, 7, 2, 4, 6, 5, 5, 5, 2, 6, 5, 4, 6, 5, 6, 7, 5, 4! 6, 5, 5, 5, 7! 6,
4, 5, 2, 2, 6, 7, 4, 2, 5, 6, 5, 2, 6! 2, 6, 5, 4, 5? 5, 7, 6, 5, 5, 2, 6, 2, 6, 5, 4, 5, 2,
7, 2, 2, 7, 7, 4, 2, 5, 9, 2, 2, 2, 4, 5, 2, 4! 5, 6, 8, 3? 5, 5, 6, 5, 2, 4, 6, 5, 5, 5, 2!
4, 5, 2, 4, 5, 5, 6, 5! 8, 6, 3, 9, 6, 3, 6, 6, 3, 4, 6, 5, 3, 5, 3, 2, 2, 3, 3, 2, 6, 3, 4,
2, 5, 6, 3, 8, 3, 2, 6, 3, 4, 3? 5, 6, 8, 3? 5, 3, 6, 3, 5, 8, 6, 2, 6, 3, 4, 5, 8, 3, 8,
2, 3, 3, 4, 8, 3, 3, 2, 8! 3, 3, 4, 8, 3, 3, 2, 8, 3, 4, 3, 2, 4, 7, 6, 6, 7, 8, 4, 6, 8, 3,
8, 8, 6, 3, 4, 6, 3! 2, 2, 7, 7, 4, 6, 7, 4, 2, 5, 6, 3, 8, 7, 2, 6, 3, 4, 3? 5, 7, 6, 3, 5,
8, 6, 2, 6, 3, 4, 5, 8, 7, 8, 2, 7, 7, 4, 8, 3, 3, 2, 8! 7, 7, 4, 8, 3, 3, 2, 8, 3, 4, 3, 2,
4, 7, 6, 6, 7, 8, 4, 6, 8, 3, 8, 8, 6, 3, 4, 6, 3, 6, 7, 3, 4, 6, 7, 7, 4, 8, 3, 3, 9, 8, 8,
4, 3, 2, 4, 5, 7, 6, 7, 8, 4, 6, 3, 5, 5, 2, 6, 9, 4, 6, 5, 6, 7, 5, 4, 6! 5, 2, 6, 2, 6, 5,
9, 5, 2? 6, 9, 6, 2, 6, 5, 4, 7, 5, 5, 4, 5, 2, 5, 2, 6, 4, 6, 2, 4, 5, 2, 7, 2, 2, 7, 7, 4,
2, 5, 5, 2, 9, 2, 4, 5, 2, 6! 4, 2, 2, 6, 5, 4, 2, 5, 7, 4, 5, 2, 5, 2, 6, 2, 6, 5, 4, 5, 2,
7, 2, 2, 7, 7, 4, 2, 5, 5, 2, 2, 2, 4, 5, 2! 7, 2, 2, 7, 7, 4, 2, 5, 5, 2, 2, 2, 4, 5, 2, 4,
7, 2, 2, 7, 2, 4, 6, 5, 5, 5, 2, 6, 5, 4, 6, 5, 6, 7, 5, 4! 6, 5, 5, 5, 7! 6, 4, 5, 2, 2, 6,
7, 4, 2, 5, 6, 5, 2, 6! 2, 6, 5, 4, 5? 5, 7, 6, 5, 5, 2, 6, 2, 6, 5, 4, 5, 2, 7, 2, 2, 7, 7,
4, 2, 5, 9, 2, 2, 2, 4, 5, 2, 4! 5, 6, 8, 3? 5, 5, 6, 5, 2, 4, 6, 5, 5, 5, 2! 4, 5, 2, 4,
5, 5, 6, 5! 2, 5, 5, 2, 9, 2, 4, 5, 2, 6! 4, 2, 2, 6, 5, 4, 2! 5, 5, 6, 5, 5, 2, 6, 2, 6, 3,
4, 5, 8, 3, 8, 2, 3, 3, 4, 8, 3, 9, 2, 8, 8, 4, 3, 2, 4, 3, 3, 6, 3, 8, 4, 6, 3, 3, 3, 8! 4,
3, 2, 4, 3, 3, 6, 3, 8, 4! 6, 3, 3, 3, 8, 6, 3, 9, 6, 3, 6, 6, 3, 4, 6, 5, 3, 5, 3! 2, 2, 3,
3, 2, 6, 3, 4, 2, 5, 6, 3, 8, 3, 2, 6, 3, 4, 3? 5, 6, 8, 3? 5, 3, 6, 3, 5, 8, 6, 2, 6, 3,
4, 5, 8, 3, 8, 2, 3, 3, 4, 8, 3, 3, 2, 8! 2, 7, 2, 4, 6, 5, 5, 5, 2, 6, 5, 4, 6, 5, 6, 7, 5,
4! 6, 5, 5, 5, 7! 6, 4, 5, 2, 2, 6, 7, 4, 2, 5, 6, 5, 2, 6! 2, 6, 5, 4, 5? 5, 7, 6, 5, 5,
2, 6, 2, 6, 5, 4, 5, 2, 7, 2, 2, 7, 7, 4, 2, 5, 9, 2, 2, 2, 4, 5, 2, 4, 5, 5, 6, 5, 2, 4, 6,
5, 5, 5, 2! 4, 5, 2, 4, 5, 5, 6, 5! 5, 6, 8, 3? 5, 5, 6, 5, 5, 2, 6, 2, 6, 3, 4, 5, 8, 3,
8, 2, 3, 3, 4, 8, 3, 9, 2, 8, 8, 4, 3, 2, 4, 3, 3, 6, 3, 8, 4, 6, 3, 3, 3, 8! 4, 3, 2, 4, 3,
3, 6, 3, 8, 4, 6, 3! 5, 6, 8, 3? 5, 6, 8, 3? 5, 6, 8, 3! 4, 2, 2, 6, 5, 4, 2, 5, 7, 4, 5,
2, 5, 2, 6, 2, 6, 5, 4, 5, 2, 7, 2, 2, 7, 4, 5, 2, 4, 6, 3, 5, 8, 6, 2, 6, 3, 4, 5, 8, 7, 8,
2, 7, 7, 4, 8, 3, 3, 2, 8! 7, 7, 4, 8, 3, 3, 2, 8, 3, 4, 3, 2, 4, 7, 6, 6, 7, 8, 4, 6, 8, 3,
8, 8, 6, 3, 4, 6, 3, 6, 7, 3, 4, 6, 7, 7, 4, 8, 3, 3, 9, 8, 8, 4, 3, 2, 4, 5, 7, 6, 7, 8, 4,
6, 3, 5, 5, 2, 6, 9, 4, 6, 5, 6, 7, 5, 4, 6! 5, 2, 6, 2, 6, 5, 9, 5, 2? 6, 9, 6, 2, 6, 5, 4,
5, 6, 5, 2, 4, 6, 5, 5, 5, 2, 7, 4, 2, 5, 5, 2, 2, 2, 4, 5, 2! 7, 2, 2, 7, 7, 4, 2, 5, 5, 2,
2, 2, 4, 5, 2, 4, 7, 2, 2, 7, 2, 4, 6, 5, 5, 5, 2, 6, 5, 4, 6, 5, 6, 7, 5, 4! 6, 5, 5, 5, 7!"
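
A quick sanity check on text like the corrected output above can be sketched as follows. This is a minimal sketch and not one of the programs used for this page. It assumes that the corrected text should contain only the digits 2 to 9, commas, spaces, line breaks, double quotes and the terminators '!' and '?'. The names Stray and findStrayChars are made up for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper type: one unexpected character and where it was found.
struct Stray
{
    int line;    // 1-based line number
    int column;  // 1-based column number
    char ch;     // the unexpected character
};

// Returns every character that is not a digit 2-9, a comma, a space,
// a line break, a double quote, or one of the terminators '!' and '?'.
std::vector<Stray> findStrayChars(const std::string& text)
{
    std::vector<Stray> result;
    int line = 1;
    int column = 1;
    for (char c : text)
    {
        bool expected = ('2' <= c && c <= '9') || c == ',' || c == ' ' ||
                        c == '\n' || c == '\r' || c == '"' ||
                        c == '!' || c == '?';
        if (!expected)
            result.push_back(Stray{line, column, c});
        if (c == '\n')
        {
            line++;
            column = 1;
        }
        else
            column++;
    }
    return result;
}
```

Any position this reports, for example a full stop or a letter left over from the OCR step, is a candidate for another look at the scan. The digit extraction program ignores such characters, so they do not change the extracted sentences.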

Program to extract digits

I wrote the following C++ program to transform the corrected OCR output into a file of sentences, where the question and exclamation marks are used as sentence terminators.
#include <stdio.h>

// Reads a file one character at a time while tracking line and column.
class FileIterator
{
public:
    FileIterator() : _f(0), _more(false), _ch(' '), _line(0), _column(0) {}
    void open(const char *filename)
    {
        _f = fopen(filename, "rt");
        if (_f == 0)
            return;
        _ch = fgetc(_f);
        _more = !feof(_f);
        _line = 1;
        _column = 1;
    }
    inline char ch() { return _ch; }
    inline bool more() { return _more; }
    inline int line() { return _line; }
    inline int column() { return _column; }
    void next()
    {
        if (!_more)
            return;
        if (_ch == '\n')
        {
            _line++;
            _column = 0;
        }
        _column++;
        _ch = fgetc(_f);
        _more = !feof(_f);
    }
    ~FileIterator()
    {
        if (_f != 0)
            fclose(_f);
    }
private:
    FILE *_f;
    bool _more;
    char _ch;
    int _line;
    int _column;
};

int main(int argc, char* argv[])
{
    FileIterator f;
    f.open("ELIC_cor_OCR.txt");
    char buffer[501];
    while (f.more())
    {
        int buffer_len = 0;
        char termination = '.';
        int line = f.line();
        int column = f.column();
        for (; f.more(); f.next())
        {
            if (f.ch() == '!' || f.ch() == '?')
            {
                termination = f.ch();
                f.next();
                break;
            }
            // keep only the digits 2-9, and never write past the end of buffer
            if ('2' <= f.ch() && f.ch() <= '9' && buffer_len < 500)
                buffer[buffer_len++] = f.ch();
        }
        buffer[buffer_len] = '\0';
        if (buffer_len == 0) break;

        printf("%s\n", buffer);
        //printf("%s at %3d.%d\n", buffer, line, column);
        //printf("%s%c at %d.%d\n", buffer, termination, line, column);
    }
}
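
The splitting rule of the program above can also be tried on a short excerpt held in memory. The following is a minimal sketch under the same rule (keep the digits 2 to 9, end a sentence at '!' or '?', ignore everything else). The function name splitSentences is made up for illustration. Unlike the program above, which stops at the first empty sentence, this sketch simply skips empty sentences.

```cpp
#include <string>
#include <vector>

// Splits text into digit sentences: keeps the digits 2-9, ends a sentence
// at '!' or '?', and ignores every other character.
std::vector<std::string> splitSentences(const std::string& text)
{
    std::vector<std::string> sentences;
    std::string current;
    for (char c : text)
    {
        if (c == '!' || c == '?')
        {
            if (!current.empty())
                sentences.push_back(current);
            current.clear();
        }
        else if ('2' <= c && c <= '9')
            current += c;
    }
    if (!current.empty())
        sentences.push_back(current);  // trailing digits without a terminator
    return sentences;
}
```

For example, the excerpt "3, 5, 7! 6, 4, 3, 2, 2, 6, 7, 4, 2, 5, 6, 3, 8, 7, 2, 6, 3, 4, 3? 5, 6, 8, 3?" yields the three sentences 357, 6432267425638726343 and 5683.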

Sorted sentences

Below are the sorted sentences, each preceded by its frequency. There are two pairs of sequences that are very much alike. In the first pair (536358626345838233483328 and 53635862634583823483328), one of the two sequences is missing a '3'. The interesting thing is that this missing '3' falls between pages 269 and 270, almost as if it was accidentally skipped. The other pair (6333863963663465353 and 6333863963663465357) differs only in the last digit. Here it seems that a '3' was accidentally changed into a '7'.

  3 2233263425638326343
  1 227742552924526
  1 2277467425638726343
  2 2552924526
  7 26545
  2 272465552654656754
  2 33483328343247667846838863463
  2 4226542
  2 422654257452526265452722745246358626345878277483328
  5 422654257452526265452722774255222452
  4 4324336384
  2 432433638463
  1 4324776784
  1 45
  7 45245565
  5 526265952
  3 536358626345838233483328
  1 53635862634583823483328
  1 55652463673467748339884324576784635526946567546
  3 55652465552
  5 5565526263458382334839288432433638463338
  1 556552626345838233483928843243465552
 17 5683
  3 576358626345878277483328
  1 5763586263458782774839288432477678463338
  4 57655262654527227742592224524
  3 5765526265452722774259222452455652465552
  1 633367425638726343
  3 6333863963663465353
  1 6333863963663465357
  2 6432267425638726343
  7 64522674256526
  1 6545
  8 65557
  1 696263473543258626345878277483328843247767846333863463673465357
  1 6962654565246555274255222452
  4 69626547554525264624527227742552924526
  6 722774255222452472272465552654656754
  4 77483328343247667846838863463673467748339884324576784635526946567546
  1 8639636634653532233263425638326343
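
A list like the one above, and a search for sequences that are very much alike, can be sketched as follows. This is a minimal sketch and not the tool that produced the list. It assumes the sentences are already available as strings, and the names countSentences, oneEditApart and nearPairs are made up for illustration. Two sequences count as alike here when they differ by exactly one changed, missing or extra digit.

```cpp
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Counts how often each sentence occurs; std::map keeps the keys sorted.
std::map<std::string, int> countSentences(const std::vector<std::string>& sentences)
{
    std::map<std::string, int> counts;
    for (const std::string& s : sentences)
        counts[s]++;
    return counts;
}

// True if a and b differ by exactly one substituted digit (same length)
// or by exactly one inserted or deleted digit (lengths differ by one).
bool oneEditApart(const std::string& a, const std::string& b)
{
    if (a.size() < b.size())
        return oneEditApart(b, a);  // make a the longer (or equally long) one
    if (a.size() - b.size() > 1 || a == b)
        return false;

    std::size_t i = 0;
    while (i < b.size() && a[i] == b[i])
        i++;                        // skip the common prefix
    if (a.size() == b.size())       // substitution: the rest after i must match
        return a.compare(i + 1, std::string::npos, b, i + 1, std::string::npos) == 0;
    // deletion: a without its digit at i must equal b
    return a.compare(i + 1, std::string::npos, b, i, std::string::npos) == 0;
}

// Lists every pair of distinct sentences that are one edit apart.
std::vector<std::pair<std::string, std::string> >
nearPairs(const std::map<std::string, int>& counts)
{
    std::vector<std::pair<std::string, std::string> > pairs;
    for (auto a = counts.begin(); a != counts.end(); ++a)
        for (auto b = std::next(a); b != counts.end(); ++b)
            if (oneEditApart(a->first, b->first))
                pairs.push_back(std::make_pair(a->first, b->first));
    return pairs;
}
```

A test this simple can also flag pairs that are not discussed in the text. Whether such a pair points to a printing error or to two genuinely different sentences still has to be judged by hand.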

PhoneSpell

Below is the source of the PhoneSpell C++ program. The program builds a digit tree, with one child per keypad digit 2 to 9, from a number of files containing word lists. It then uses this tree to find, at every position of each digit sequence in an input file, the words whose phone spelling starts there.

#include <stdio.h>
#include <string.h>
#include <ctype.h>

const char* strcopy(const char* s)
{
    char* result = new char[strlen(s)+1];
    strcpy(result, s);
    return result;
}

int encode[256];

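// Singly linked list of words, kept in alphabetical order by addString.
class StrList
{
public:
    StrList(const char *s, StrList* n) : str(strcopy(s)), next(n) {}
    const char* str;
    StrList* next;
    // Adds the number of words from this node to the end of the list to c.
    void count(int &c)
    {
        for (StrList* p = this; p != 0; p = p->next)
            c++;
    }
    // Prints every word from this node onwards, each preceded by sep.
    void print(const char* sep)
    {
        for (StrList* p = this; p != 0; p = p->next)
            printf("%s%s", sep, p->str);
    }
};

// Tree node with one child per keypad digit ('2'..'9' map to index 0..7);
// strings holds the words whose phone spelling ends at this node.
class DigitTree
{
public:
    DigitTree()
    {
        strings = 0;
        for (int i = 0; i < 8; i++)
            children[i] = 0;
    }
    DigitTree *children[8];
    StrList* strings;
    // Adds the number of words stored in this subtree to c.
    void count(int &c)
    {
        if (strings != 0)
            strings->count(c);
        for (int i = 0; i < 8; i++)
            if (children[i] != 0)
                children[i]->count(c);
    }
};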

void addString(char *word, StrList *&strings)
{
    if (strings == 0 || strcmp(word, strings->str) < 0)
    {
        strings = new StrList(word, strings);
        return;
    }

    if (strcmp(word, strings->str) > 0)
        addString(word, strings->next);
}

void addWord(char *word, char *s, DigitTree *&tree)
{
    if (tree == 0)
        tree = new DigitTree;

    if (*s == '\0')
    {
        addString(word, tree->strings);
        return;
    }

    if (*s == '\'')
    {
        addWord(word, s+1, tree);
        return;
    }

    int e = encode[(unsigned char)*s];
    if (e == -1)
    {
        *s = '\0';
        addString(word, tree->strings);
    }
    else
        addWord(word, s + 1, tree->children[e]);
}

void addWord(char *word, DigitTree *&tree)
{
    addWord(word, word, tree);
}

void addList(DigitTree *&tree, const char* filename)
{
    FILE *f = fopen(filename, "r");
    if (f == 0)
    {
        printf("Could not open list '%s'\n", filename);
        return;
    }

    printf("Adding %s\n", filename);
    char buffer[300];
    while (fgets(buffer, 299, f))
    {
        addWord(buffer, tree);
    }
    fclose(f);
    //printf("Done\n");
}

#define MAX_LINES 100
char lines[MAX_LINES][504]; // room for a full input line plus the extra terminators

int main(int argc, char* argv[])
{
    // init encode
    {
        for (int i = 0; i < 256; i++)
            encode[i] = -1;

        int values[] = { 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 9, 9, 9, 9 };
        for (char c = 'a'; c <= 'z'; c++)
            encode[c] = encode[toupper(c)] = values[c - 'a'] - 2;
    }

    DigitTree *root_tree = 0;
    addList(root_tree, "C:\\Temp\\scowl\\final\\english-words.10");
    addList(root_tree, "C:\\Temp\\scowl\\final\\english-upper.10");
    addList(root_tree, "C:\\Temp\\scowl\\final\\english-contractions.10");
    addList(root_tree, "C:\\Temp\\scowl\\final\\american-words.10");
    addList(root_tree, "C:\\Temp\\scowl\\final\\english-words.20");
    addList(root_tree, "C:\\Temp\\scowl\\final\\american-words.20");
    addList(root_tree, "C:\\Temp\\scowl\\final\\english-words.35");
    addList(root_tree, "C:\\Temp\\scowl\\final\\english-upper.35");
    addList(root_tree, "C:\\Temp\\scowl\\final\\english-contractions.35");
    addList(root_tree, "C:\\Temp\\scowl\\final\\american-words.35");
    addList(root_tree, "C:\\Temp\\scowl\\final\\english-words.40");
    addList(root_tree, "C:\\Temp\\scowl\\final\\english-upper.40");
    addList(root_tree, "C:\\Temp\\scowl\\final\\english-contractions.40");
    addList(root_tree, "C:\\Temp\\scowl\\final\\american-words.40");

    int c = 0;
    if (root_tree != 0) // stays null when no word list could be read
        root_tree->count(c);
    printf("Total words = %d\n\n", c);

    char buffer[501];
    FILE *f = fopen("ELICdigits.txt", "rt");
    if (f == 0)
    {
        printf("Could not open 'ELICdigits.txt'\n");
        return 1;
    }
    while (fgets(buffer, 500, f))
    {
        int buffer_len = strlen(buffer);
        if (buffer_len > 0 && buffer[buffer_len-1] == '\n')
            buffer[--buffer_len] = '\0';

        printf("%s\n", buffer);
        int nr_lines = 0;
        for (int i = 0; '2' <= buffer[i] && buffer[i] <= '9'; i++)
        {
            DigitTree* tree = root_tree;
            for (int j = i; tree != 0; j++)
            {
                for (StrList *strings = tree->strings; strings != 0; strings = strings->next)
                {
                    const char *s = strings->str;
                    int len = strlen(s);

                    if (len == 1 && *s != 'I' && *s != 'a') continue;
                    if (len > 2 && s[len-1] == 's' && s[len-2] == '\'') continue;

                    // look for place to put string
                    int l;
                    for (l = 0; l < nr_lines; l++)
                        if (lines[l][i] == ' ' && (i == 0 || lines[l][i-1] == ' '))
                            break;
                    if (l == nr_lines && nr_lines < MAX_LINES)
                    {
                        l = nr_lines++;
                        for (int k = 0; k < buffer_len; k++)
                            lines[l][k] = ' ';
                        lines[l][buffer_len] = '\0';
                        lines[l][buffer_len+1] = '\0';
                        lines[l][buffer_len+2] = '\0';
                    }
                    if (l < nr_lines)
                    {
                        for (int k = i; *s != '\0'; k++, s++)
                            lines[l][k] = *s;
                    }
                }
                if (buffer[j] < '2' || buffer[j] > '9')
                    break;

                tree = tree->children[buffer[j]-'2'];
            }
        }
        for (int l = 0; l < nr_lines; l++)
            printf("%s\n", lines[l]);
        printf("\n");
    }
}
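
The keypad encoding behind the encode table in the program above can be illustrated in isolation. The following is a minimal sketch and not part of PhoneSpell. The function name toKeypadDigits is made up for illustration, and it skips every character that is not a letter, whereas the program above skips only apostrophes and cuts a word off at any other non-letter.

```cpp
#include <cctype>
#include <string>

// Maps a word to its phone-keypad digits (abc=2, def=3, ..., wxyz=9).
// Characters that are not letters a-z or A-Z, such as apostrophes, are skipped.
std::string toKeypadDigits(const std::string& word)
{
    static const char digits[] = "22233344455566677778889999";  // for a..z
    std::string result;
    for (char c : word)
    {
        int lower = std::tolower(static_cast<unsigned char>(c));
        if ('a' <= lower && lower <= 'z')
            result += digits[lower - 'a'];
    }
    return result;
}
```

With this mapping both "love" and "loud" become 5683, and "hello" becomes 43556, which is why several words can share one digit sequence and the output has to list all candidates.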

Results

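Below is the output of the PhoneSpell C++ program (mentioned in my online diary on August 4, 2011) for all the different digit sequences found in the book. Each digit sequence is followed by the words whose phone spelling matches part of it, with every word printed in the column of the digit where it starts. I have underlined some of the words that I think belong to the original text that was converted to phone spelling with digits.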

43556
I  Ln
I'd
he
id
if
gel
he'll
hell
hello
 elk

4748732559968
I I Rd  kW Mt
gs ts a   yo
is up all wot
grit fall you
grits ally mu
 pi re
 sh Sec
 Sgt dally
 pit
 sit
 pits
 sits
  it
  its
   us
   vs
   use
    pea
    sea
    peak
    peal
    peck
    real
    seal
    really

4357
I Jr
I'd
he
id
if
gel
gels
help
  ks
  ls

5683
Ln
jot
lot
loud
love
 Mt
 mu
 mud


33284
Dec I
Feb
debt
feat
death
debug
 eat
 fat
  a
  at
  Aug
  bug
   uh

696263473543258626345878277483328843247767846333863463673465357
my a eh  kid Jun me just Sq tee tug a Sq St me duo I do eh kelp
ow Co I  lid jumbo I ts a pi Dec uh Ci sm uh Fed me me Rd OK Jr
ox am gs lie  to of lust Sr  Feb the I so tin  fun I'm re old
own me Rd I a um meg up As I debt I ah pop I fed of of peg   ks
 yo of re I'd  ma eh us Cs it eat I'd Sr rug fee meg em dim  ls
  ma dip  he   man   vs as I've  tie gs Mr I'm dune men din
  man is  id   mane   St pr  feat he is Ms go  fume odor I
  mane    if   name   pub sh debut fag sop h'm fund  dos I'm
  name    heal oboe   qua Sgt fat id  pr ruin   to go Mr go
  oboe    heck  a     rub pit  a vie  rs ruined um h'm rein
   an ire   alto      sub sit  at if  pro time  toe  for h'm
   and          Co    pubs hue but dais Os ho     eh dope
   cod          am    rubs it'd    fags ms in     dim Ms ho
    meg         an    star guff    fair or God    din Os in
    megs        and   subs hued    fairs  vine    ego ms ink
     dis        cod   stars    cut  ch port of    fin or gold
     fir               tap huff     air opt odd   dime  ego
     dire              tar          airs  timed   dine  fin
     fire              taps         airport ode   find   golf
                       tarp           prop god    fine   hold
                       tars           pros hoe     ho ore
                        cs            sport off    in    hole
                        Apr            post need   God
                        ass            sort  deft  god
                        arrive             hoed    hoe
                        arrived              feet  info
                         rs                          dose
                         spit                        ford
                         spite                       fore
                         spited                      forego
                          rite
                          site
                          sited

6432267425638726343
mi a Mr a me pa eh
oh ban I Ln ts me
 I cam ha of pan I
 I'd Ms clod ram I'd
 he a pi loft a did
 id Co gal fur of
 if am halo up meg
  dab sh lofts  die
  ebb rib met Co he
   can  aloft am id
   bans   net an if
   camp   nets
   cans     us
    an      vs
    amp      ran
    bop      pane
    cop      same
     Os      sand
     ms      sane
     or       and
      sic     cod
      pick
      sick

5763586263458782774839288432477678463338
Jr flu a eh ts a pi ex tug a Sq St me
ks  Jun me just Sq vex  uh Ci sm uh Fed
ls  jumbo I up As I  watt fag so tin
 sm  to of lust Sr    a the I pop I fed
 so  um meg us Cs it  at I ah sop I'm
 pod  ma    vs as I've  tie gs Mr go
 rod  man    St pr    but dais Ms h'm
 roe  mane   pub sh   cut fags Os ho
 sod  name   qua Sgt    vie is ms in
 smelt Co    rub pit     I'd Sr rug fee
  me  oboe   sub sit     he  pr ruin
  of   am    pubs hue    id  rs ruined
  melt an    rubs it'd   if  pro time
       and   star         fair or God
       cod   subs         fairs  vine
             stars         ch port of
              tap          air opt odd
              tar          airs  timed
              taps         airport ode
              tarp           prop god
              tars           pros hoe
               cs            sport off
               Apr            post need
               ass            sort  deft
               arrive             hoed
                rs                  feet
                spit
                spite
                 rite
                 site

4324776784
I a Sq St
I'd Sr rug
he I sm uh
id gs Mr I
if is Ms
 fag so
 dais Os
 fags ms
 fair or
 fairs
  Ci pop
  ah sop
  ch port
  air opt
  airs
  airport
    pr
    rs
    pro
    prop
    pros
    sport
     post
     sort

6333863963663465357
me duo yo no I kelp
of fun woe me OK Jr
odd to  me of old
ode um  of meg   ks
off toe men eh   ls
need me memo I'm
 Fed of neon go
 fed mew do dim
 fee new em din
 deft ex don h'm
 feet    eon ho
   dune  dome
   fume  done
   fund  fond
         food
          on in
          mod
          nod
          one
            ego
            fin
             ink
             gold
             golf
             hold
             hole

576358626345878277483328
Jr flu a eh ts a pi Dec
ks  Jun me just Sq tee
ls  jumbo I up As I Feb
 sm  to of lust Sr  debt
 so  um meg us Cs it eat
 pod  ma    vs as I've
 rod  man    St pr  feat
 roe  mane   pub sh  fat
 sod  name   qua Sgt  a
 smelt Co    rub pit  at
  me  oboe   sub sit
  of   am    pubs hue
  melt an    rubs it'd
       and   star guff
       cod   subs hued
             stars
              tap huff
              tar
              taps
              tarp
              tars
               cs
               Apr
               ass
               arrive
               arrived
                rs
                spit
                spite
                spited
                 rite
                 site
                 sited

77483328343247667846838863463673467748339884324576784635526946567546
Sq tee  eh a sm St Mt tune me Rd Mr I dew uh a Jr St me Jan I Ln kin
Sr  Dec did I no uh vet me of re Ms it ex the I sm uh elk my OK Pl
pr  Feb die gs Mr I    to I do eh Sq tee tug Ci so tin  jam I'm ski
rs  debt I Ci on tin   um I'm peg Sr  few tie  ks rug    a who Mr I
spit eat I'd so rug    toe men dim pi fez vie  ls ruin   Co go Ms I'm
spite a  he is Ms I'm   of odor I pr       I ah pop I    am h'm skim
spited   id iron thou   meg em din sh      I'd  sop I'm  an ho Os go
 pi feat if irons go     eh dos I'm I've   he   port of  any  lop h'm
 sh  fat  fag mop h'm    dim Mr go Sgt     id   post     bow   ms ho
 Sgt date dais Os ho     din Ms h'm hue    if   sort     box   or in
 pit eave fags ms in     ego Os ho pit      fag  Mr go   boy    skin
 sit fate fair or gnu    fin ms in sit      fail Ms h'm  cow    slim
 rite at   ah nor got    dime rein rite     fails  time  cox
 site Ave  ch most mu    dine reins it'd     ch  Os ho   coy
 sited     air opt mud   find  ego site      ail ms in    ow
  I   ate  bison tint    fine  fin sited     ails  vine   ox
  it  bud    son  hot     go or hop guff         or God    win
  I've       pomp hove    h'm  dims hued         opt       wink
  hue cue    poop         ho ore Os huff            god     in
  it'd       poor         in   dins huffy           hoe     ink
  guff       romp         God  egos
  hued       sons         god  fins
  huff       roost        hoe   imp
      budge  snort        info  ins
              north         for hops
                ruin        dope ms
                stint       dose or
                quintet     ford Mrs
                            fore  rs
                            forego
                                imps
                                  spit
                                  spite
                                  spited

526265952
Jan OK  a
jam  kW
jamb
lamb
 a a
 Co
 am
 an
 boa
 bob
 cob
  ma
  man
  manly
   Co
   am
   an

69626547554525264624527227742552924526
my a lip  I a a I a jar a pi  jaw I a
ow Co I     clam ma lap As I  jay  Jan
ox am gs    clan nag a a Sq a law  jam
own OK Pl   clang Ci As Cs ha lax   Co
 yo   is     Jan mail pa Sr all wag am
  ma  irk    jam nail sac sh  lay   an
  man         Co  ah Cs as gal a a
   an         am  ch as cs gall wail
              an  ail sacs hall  Ci
              bog  I cs Apr allay
              cog    arc pr    ax
              coin   bra rs    by
               mi    Arab rib    ah
               oh    crab sic    ch
               ohm   Arabs       ail
                I'm  crabs
                go    scar
                h'm   scars
                ho     bar
                in     cap
                Inc    car
                gob    bars
                inch   bass
                       caps
                       carp
                       cars
                        ass
                         prick
                          pick
                          sick
                          shall

422654257452526265452722774255222452
I a lib pi a a a  I a a Sq a jab I a
ha OK a sh clam OK jar a pi  lab
gab kick I clan    lap As I   a a
 a  lick    Jan     As Cs ha  baa
 ban I Jr   jam     Cs as gal cab
 cam ha     jamb    as cs gall a
 can gal    lamb    cs Apr all bag
 bank  ks    Co     arc Sr     bail
  Co gals    am     bra pr      Ci
  am   ls    an     Arab sh     ah
  an         boa    crab rib    ch
  colic      bob    Arabs hall  ail
    kicks    cob    crabs
    licks     ma     pa rs
              man    sac sic
               Co    sacs
               am    scar
               an    scars
                      bar
                      cap
                      car
                      bars
                      bass
                      caps
                      carp
                      cars
                       ass
                        prick
                         pick
                         sick
                         shall

722774255222452472272465552654656754
pa Sq a jab I a pa pa OK Jan I Ln  I
sac pi  lab  lag a rag   jam I'm Pl
sacs I   a a lags a a     a kin Mr
scar ha  baa lair As I    Co go Ms
scars all a   Ci bar I'm  am h'm ski
 a Sr    cab  ah cap go   an ho Os
 bar gal  bag ch car h'm   OK OK
 cap gall bail I barb       kink
 car hall  Ci air Cs ho     link
 bars      ah circa Ci       in ms
 bass      ch  gs as in      ink
 caps      ail is cs ink       lop
 carp          grab ah          or
 cars          grabs
  a sh          sac ch
  As            sacs
  Cs            scar
  as              arc
  cs              bra
  Apr             arch
  ass             brag
   pr             crag
   rs             brain
   prick           sag
    rib            pain
    sic            rain
    pick            ago
    sick            aim
    shall           bin

4324336384
I a den uh
I'd fen  I
he I do
id I'd dug
if he me
 fag em
  Ci doe
  ah end
  ch foe
  age of
  aid met
  bid net
  aged
  aide
  bide
  chef
   id
   if
   gee
   he'd
    fend

6333863963663465353
me duo yo no I  eke
of fun woe me OK
odd to  me of old
ode um  of meg  elf
off toe men eh
need me memo I'm
 Fed of neon go
 fed mew do dim
 fee new em din
 deft ex don h'm
 feet    eon ho
   dune  dome
   fume  done
   fund  fond
         food
          on in
          mod
          nod
          one
            ego
            fin
             ink
             gold
             golf
             hold
             hole

2233263425638326343
a Dec eh Ln tea eh
ace a fib me dam I
bad Co I loft a did
cad am ha of fan I'd
caf an gal dud me
aced me a met Co he
bade of clod dame
 a dam halo team id
 ad and aloft am if
 be cod   net an
 add meg  mete of
 bed  dial due meg
 bee  dick eve  die
  Feb        fame
  dean        and
   fan        cod
   dame
   fame

536358626345838233483328
Leo Jun me  vet edit eat
ken jumbo I  dub eh Dec
lend to of   evade tee
 do  um meg   tad I Feb
 em   ma eh    a fit fat
 doe  man glue ad it  a
 end  mane     be I've
 foe  name     add  debt
  me  oboe     bed  feat
  of   a       bee    at
  melt Co      befit
   flu am       edited
       an        dive
       and       five
       cod       dived
                  hue
                  it'd
                  guff
                  hued
                  huff

33483328343247667846838863463
edit eat I a sm St Mt tune me
edited  eh Ci no uh vet me of
 eh Dec did I on tin   to I
 fit fat I'd so rug    um I'm
 dive a die gs Mr I    toe
 five at he is Ms I'm   of
 dived   id iron thou   meg
  I Feb  if irons go     eh
  it date fag mop h'm    dim
  I've    dais Os ho     din
  hue Ave fags ms in     ego
  it'd    fair or gnu    fin
  guff     ah nor got    dime
  hued     ch most mu    dine
  huff     air opt mud   find
   tee     bison tint    fine
    debt     son  hot     go
    feat     pomp hove    h'm
     eave    poop         ho
     fate    poor         in
      ate    romp         God
      bud    sons         god
      cue    roost        hoe
      budge  snort
              north
                ruin
                stint
                quintet

227742552924526
a Sq a jaw I a
bar I  jay  Jan
cap ha law  jam
car gal a a  Co
bars all wag am
bass allay   an
caps   lax
carp   lay
cars    ax
 a pi   by
 As gall wail
 Cs hall  Ci
 as       ah
 cs       ch
 Apr      ail
 ass
  Sr
  pr
  rs
  prick
   sh
   rib
   sic
   pick
   sick
   shall

65557
OK Jr
   ks
   ls


64522674256526
mi a Mr a OK
oh ban I Ln a
nil a pi   Jan
oil Co ha  jam
 I cam gal  Co
  jab sh    am
  lab rib   an
  labor
   can halo
   bans
   camp
   cans
    am
    an
    amp
    bop
    cop
     Ms
     Os
     ms
     or
      sic
      pick
      sick

26545
a  I
Co
am
an
 OK

5765526265452722774259222452455652465552
Jr llama  I a a Sq a  a a lag Ln a OK  a
ks  Jan OK jar a pi kW a I a   OK I
ls  jam    lap As I   baa jail  lag
 sm jamb    As Cs ha  cab  Ci   lain
 so lamb    Cs as gal  bag ah    Ci
 poll ma    as cs icky bail I    ah
 roll man   cs Apr      Ci ch    ch
 polka a    arc Sr      ah ail   ago
  OK a Co   bra pr      ch bill  aim
     Co     Arab sh     ail I'll bin
     am     crab rib        ilk   I'm
     an     Arabs           ill   go
     boa    crabs                 h'm
     bob     pa rs                ho
     cob     sac sic              in
       am    sacs                 ink
       an    scar
             scars
              bar
              cap
              car
              bars
              bass
              caps
              carp
              cars
               ass
                prick
                 pick
                 sick
                 picky
                 shaky

45245565
I a  Ln
 lag  OK
 jail
  Ci
  ah
  ch
  ail
  bill
   I
   I'll
   ilk
   ill

5565526263458382334839288432433638463338
 Ln Jan me  vet edit watt fag do uh Fed
 loll ma eh  dub eh ex tug a den tin
  OK a a  I  evade vex  uh Ci em time
   llama  glue a fit  a the I doe I fed
    jam of    tad I   at I ah end I'm
    jamb       ad it  but  ch foe go
    lamb       be I've  tie I'd dug fee
     Co meg    add    cut  age me h'm
     am        bed      vie he of ho
     an        bee       I'd fen vine
     boa       befit     he id met me
     bob         dive    id if net of
     cob         five    if gee  timed
      man         hue      aid method
      mane        it'd     bid    in
      name                 aged   God
      oboe                 aide   god
       Co                  bide   hoe
       am                  chef   hoed
       an                   he'd   odd
       and                   fend  ode
       cod                         off
                                   need
                                    deft
                                    feet

432433638463
I a den uh
I'd fen tin
he I do time
id I'd dug
if he me I
 fag em vine
  Ci doe I'm
  ah end go
  ch foe h'm
  age of ho
  aid met me
  bid net of
  aged   in
  aide   God
  bide   god
  chef   hoe
   id method
   if
   gee
   he'd
    fend

422654257452526265452722745246358626345878277483328
I a lib pi a a a  I a a pi a me to me just Sq tee
ha OK a sh clam OK jar a I Ci flu a eh ts a pi Dec
gab kick I clan    lap As lag  Jun of lust Sr  Feb
 a  lick    Jan     As Cs lain jumbo I up As I debt
 ban I Jr   jam     Cs as  ah   um meg us Cs it eat
 cam ha     jamb    as cs  ch    ma    vs as I've
 can gal    lamb    cs ash ago   man    St pr  feat
 bank  ks    Co     arc sh aim   mane   pub sh  fat
  Co gals    am     bra    bin   name   qua Sgt  a
  am   ls    an     Arab   bind  oboe   rub pit  at
  an         boa    crab    I     Co    sub sit
  colic      bob    Arabs   I'm   am    pubs hue
    kicks    cob    crabs   go    an    rubs it'd
    licks     ma     pa     h'm   and   star guff
              man    sac    ho    cod   subs hued
               Co    sacs   in          stars
               am    scar   God          tap huff
               an     bar   god          tar
                      cap   hoe          taps
                      car    of          tarp
                      bash   melt        tars
                      cash                cs
                      basil               Apr
                                          ass
                                          arrive
                                          arrived
                                           rs
                                           spit
                                           spite
                                           spited
                                            rite
                                            site
                                            sited

57655262654527227742592224524
Jr llama  I a a Sq a  a a lag
ks  Jan OK jar a pi kW a I a
ls  jam    lap As I   baa  Ci
 sm jamb    As Cs ha  cab  ah
 so lamb    Cs as gal  bag ch
 poll ma    as cs icky bail I
 roll man   cs Apr      Ci
 polka a    arc Sr      ah
  OK a Co   bra pr      ch
     Co     Arab sh     ail
     am     crab rib
     an     Arabs
     boa    crabs
     bob     pa rs
     cob     sac sic
       am    sacs
       an    scar
             scars
              bar
              cap
              car
              bars
              bass
              caps
              carp
              cars
               ass
                prick
                 pick
                 sick
                 picky
                 shaky

55652463673467748339884324576784635526946567546
 Ln a me Rd Mr I dew uh a Jr St me Jan I Ln kin
  OK I do eh Sq tee tug Ci sm uh elk my OK Pl
   lag em dim pi few the I so tin  jam I'm ski
   lain Mr I Sr  fez tie  ks rug    a who Mr I
    Ci dos I'm it ex vie  ls ruin   Co go Ms I'm
    ah for go sh      I ah pop I    am h'm skim
    ch dope Ms I've   I'd  sop I'm  an ho Os go
    ago Ms h'm hue    he   port of  any  lop h'm
    aim Os ho Sgt     id   post     bow   ms ho
    bin ms in pit     if   sort     box   or in
    bind re Os it'd    fag  Mr go   boy    skin
     I'm peg pr        fail Ms h'm  cow    slim
     go or hop guff    fails  time  cox
     h'm rein sit       ch  Os ho   coy
     ho ore ms hued     ail ms in    ow
     in  reins huff     ails  vine   ox
     God  din rite          or God    win
     god  ego site          opt       wink
     hoe  fin sited            god     in
     info dims huffy           hoe     ink
      of  dins
      men egos
      odor imp
       dose or
       ford Mrs
       fore  rs
       forego
          fins
           ins
           hops
           imps
             spit
             spite
             spited

2552924526
a jaw I a
all wag Co
allay  Jan
  jay  jam
  law   am
  lax   an
  lay
   a a
   ax
   by
    wail
     Ci
     ah
     ch
     ail

4226542
I a lib
ha OK a
gab  I
 a   ha
 ban
 cam
 can
 bank
  Co
  am
  an
  colic

272465552654656754
a a OK Jan I Ln  I
As I   jam I'm Pl
Cs I'm  a kin Mr
as go   Co go Ms
cs h'm  am h'm ski
arc     an ho Os
bra      OK OK
arch      kink
brag      link
crag       in ms
brain      ink
 pa          lop
 rag          or
 sag
 pain
 rain
  Ci
  ah
  ch
  ago
  aim
  bin
   ho
   in
   ink

556552626345838233483928843243465552
 Ln Jan me  vet edit watt fag I    a
 loll ma eh  dub eh ex tug a eh
  OK a a  I  evade vex  uh Ci I'm
   llama  glue a fit  a the I go
    jam of    tad I   at I ah h'm
    jamb       ad it  but  ch ho
    lamb       be I've  tie I'd
     Co meg    add    cut  age OK
     am        bed      vie he
     an        bee       I'd dim
     boa       befit     he id
     bob         dive    id if
     cob         five    if  din
      man         hue      aid
      mane        it'd     bid
      name                   ego
      oboe                   fin
       Co                     in
       am                     ink
       an
       and
       cod

6545
OK
  I

633367425638726343
me do I Ln ts me
of em ha me pa eh
odd Mr a of pan I
ode Ms clod ram I'd
off Os aloft a did
need pi loft Co he
 Fed sh lofts of
 fed rib met am id
 fee sic net an if
 deem gal fur meg
 deems   nets  die
  den halo up
  fen      us
  dens     vs
   dos      ran
   for      pane
    ms      same
    or      sand
     pick   sane
     sick    and
             cod

55652465552
 Ln a OK  a
  OK I
   lag
   lain
    Ci
    ah
    ch
    ago
    aim
    bin
     I'm
     go
     h'm
     ho
     in
     ink

8639636634653532233263425638326343
to yo no I  eke a dam I Ln tea eh
um woe me OK  dab fan ha me dam I
toe me of old ebb dame a of fan I'd
 me of meg  elf ad a eh loft a did
 of men eh  flea Dec fib met Co he
 mew do dim   face Co gal dud me
 new em din   ebbed me clod dame
  ex don I'm  faced of aloft am id
    memo go    a Feb dial due of
    neon h'm   ace am halo team if
     eon ho    bad an    net an
     dome      cad and   mete meg
     done      caf cod    eve  die
     fond      aced meg     fame
     food      bade  dick    and
      on in     be           cod
      mod       add
      nod       bed
      one       bee
        ego      dean
        fin       fame
         ink
         gold
         golf
         hold
         hole

2277467425638726343
a Sq Mr a me pa eh
bar I pi Ln ts me
cap I'm clod pan I
car go I loft a did
bars Ms aloft Co I'd
bass Os  lofts of
caps ms   of ram he
carp or   met am id
cars  sh  net an if
barrio ha nets meg
barrios    fur  die
 a pi rib   up
 As h'm     us
 Cs ho gal  vs
 as in halo  ran
 cs hop      pane
 Apr  sic    same
 ass  pick   sand
  Sr  sick   sane
  pr          and
  rs          cod
  prim
  spin
  primp
  prior
  spins
   sh
   pin
   rim
   sin
   pimp
   pins
   rims
   shop
   sins
    imp
    ins
    gosh

6962654565246555274255222452
my a kiln a OK jar a jab I a
ow Co I OK I   lap all a
ox am  Ln Ci   lash  lab
own OK   lag    a I   a a
 yo  kilo ah    As    baa
  ma     lain   Cs    cab
  man     ch    as     bag
   an     ago   cs     bail
          aim   ash     Ci
          bin   Asia    ah
           I'm  aria    ch
           go   crib    ail
           h'm  brick
           ho   crick
           in    pi
           ink   sh
                 rib
                 sic
                 pick
                 sick
                 shall
                  ha
                  gal
                  gall
                  hall

84474766825653
uh pi no a OK
ugh I on  Ln
this sm   joke
 I sh Nov  old
 hi gs Mt
 hip so
 his son
  I is mu
  gs root
  is snot
  grip ova
  iris muck
   pis oval
   rip
   sip
   sir
    iron
     soot
      not
      nova
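
The word lists above appear to follow a simple layout: each word sits in the column of the digit where it starts, and its letters spell the digits below that position on a phone keypad. The following is a minimal sketch of that check, not the tool that produced the lists. It assumes the standard keypad layout (2 = abc, 3 = def, 4 = ghi, 5 = jkl, 6 = mno, 7 = pqrs, 8 = tuv, 9 = wxyz) and that apostrophes are skipped. The names keypad_digit, to_digits, word_fits and self_check are made up for this illustration.

```cpp
#include <cassert>
#include <string>

// Digit for one letter on the assumed standard keypad; '\0' for anything else.
char keypad_digit(char letter)
{
    static const char digits[] = "22233344455566677778889999"; // a..z
    if (letter >= 'A' && letter <= 'Z')
        letter = (char)(letter - 'A' + 'a');
    if (letter < 'a' || letter > 'z')
        return '\0';
    return digits[letter - 'a'];
}

// Digit string for a word; characters that are not letters (e.g. ') are skipped.
std::string to_digits(const std::string& word)
{
    std::string result;
    for (std::string::size_type i = 0; i < word.size(); i++)
    {
        char d = keypad_digit(word[i]);
        if (d != '\0')
            result += d;
    }
    return result;
}

// True if the word spells the digits starting at position pos.
bool word_fits(const std::string& digits, std::string::size_type pos,
               const std::string& word)
{
    std::string code = to_digits(word);
    if (code.empty() || pos > digits.size())
        return false;
    return digits.compare(pos, code.size(), code) == 0;
}

// Checks against the last list above (84474766825653).
void self_check()
{
    const std::string digits = "84474766825653";
    assert(to_digits("joke") == "5653");
    assert(word_fits(digits, 0, "this"));   // "this" is listed in column 0
    assert(word_fits(digits, 10, "joke"));  // "joke" is listed in column 10
    assert(!word_fits(digits, 9, "joke"));  // one column earlier does not fit
}
```

Calling self_check() from a main function runs the checks. A word list could be verified the same way, by taking the column of each word as pos.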

Program to analyze repeating sequences in digits

Up to August 15, 2011, I developed the following program to analyze repeating sequences in the digits, in order to discover how the sequence was produced. The output can be found in the next section. (The version shown here differs only slightly from the version of August 4, 2011.)
#include <stdio.h>
#include <string.h>   // strlen and strcpy live here, not in strings.h
#include <math.h>

const char* strcopy(const char* s)
{
    char* result = new char[strlen(s)+1];
    strcpy(result, s);
    return result;
}

class FileIterator
{
public:
    FileIterator() : _f(0), _more(false), _ch(' '), _line(0), _column(0) {}
    void open(const char *filename)
    {
        _f = fopen(filename, "rt");
        if (_f == 0)
            return;
        _ch = fgetc(_f);
        _more = !feof(_f);
        _line = 1;
        _column = 1;
    }
    inline char ch() { return _ch; }
    inline bool more() { return _more; }
    inline int line() { return _line; }
    inline int column() { return _column; }
    void next()
    {
        if (!_more)
            return;
        if (_ch == '\n')
        {
            _line++;
            _column = 0;
        }
        _column++;
        _ch = fgetc(_f);
        _more = !feof(_f);
    }
    ~FileIterator()
    {
        if (_f != 0)
            fclose(_f);
    }
private:
    FILE *_f;
    bool _more;
    char _ch;
    int _line;
    int _column;
};

const char *lines[200];
char terminators[200];
int nr_lines;
char eqlines[200][200];

class Score
{
public:
    int len1;
    int len2;
    int max_sum;
    int o_max;
    int percent;
    bool partial;
    bool at_begin;
    bool at_end;
    int first_i1;
    int last_i1;
    int first_i2;
    int last_i2;
    int match;
    int mutations;

    Score(const char *s1, const char* s2) : _s1(s1), _s2(s2)
    {
        len1 = strlen(s1);
        len2 = strlen(s2);

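        max_sum = 0;
        o_max = 0;
        // Defaults for the case that no digit matches at any offset.
        at_begin = false;
        at_end = false;
        first_i2 = last_i1 = last_i2 = -1;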
        for (int o = -len1; o < len2; o++)
        {
            int sum = 0;
            int i1, i2;
            bool first = false;
            bool last = false;
            for (i1 = o < 0 ? -o : 0, i2 = o > 0 ? o : 0; i1 < len1 && i2 < len2; i1++, i2++)
                if (s1[i1] == s2[i2])
                {
                    sum++;
                    if (i1 == 0 && i2 == 0)
                        first = true;
                    if (i1 == len1-1 && i2 == len2-1)
                        last = true;
                }

            if (sum > max_sum)
            {
                o_max = o;
                max_sum = sum;
                at_begin = first;
                at_end = last;
            }
        }
        partial = len2 > len1;

        if (max_sum == len1)
        {
            percent = 100;
        }
        else
        {
            percent = (int)floor(0.5 + 100 * (double)max_sum/(double)len1);
            if (percent > 99)
                percent = 99;
            if (percent < 0)
                percent = 0;
        }

        first_i1 = -1;
        mutations = 0;
        int i1, i2;
        int mut = 0;
        match = 0;
        for (i1 = o_max < 0 ? -o_max : 0, i2 = o_max > 0 ? o_max : 0; i1 < len1 && i2 < len2; i1++, i2++)
            if (_s1[i1] == _s2[i2])
            {
                match++;
                if (first_i1 == -1)
                {
                    first_i1 = i1;
                    first_i2 = i2;
                    mut = 0;
                }
                last_i1 = i1;
                last_i2 = i2;
                mutations = mut;
            }
            else
                mut++;
        if (first_i1 != -1)
        {
            while (   (first_i1 == 2 && first_i2 == 2 && mutations+2 < match)
                   || (first_i1 == 1 && first_i2 > 0 && mutations+1 < match))
            {
                mutations++;
                first_i1--;
                first_i2--;
            }
            while (   (last_i1 == len1-3 && last_i2 == len2-3 && mutations+2 < match)
                   || (last_i1 == len1-2 && last_i2 < len2-1 && mutations+1 < match))
            {
                mutations++;
                last_i1++;
                last_i2++;
            }
            if (first_i1 == 0 && first_i2 == 0)
                at_begin = true;
            if (last_i1 == len1-1 && last_i2 == len2-1)
                at_end = true;
        }
    }

    bool hits(bool *hitlist)
    {
        bool result = false;
        int i1, i2;
        for (i1 = o_max < 0 ? -o_max : 0, i2 = o_max > 0 ? o_max : 0; i1 < len1 && i2 < len2; i1++, i2++)
            if (_s1[i1] == _s2[i2] && !hitlist[i1])
            {
                hitlist[i1] = true;
                result = true;
            }

        return result;
    }
    void print_mutations()
    {
        if (mutations == 1)
            printf(" with 1 mutation");
        else if (mutations > 1)
            printf(" with %d mutations", mutations);
    }

private:
    const char *_s1;
    const char *_s2;
};


int main(int argc, char* argv[])
{
    FileIterator f;
    f.open("ELIC_cor_OCR.txt");
    char buffer[501];
    while (f.more())
    {
        int buffer_len = 0;
        char termination = '.';
        int line = f.line();
        int column = f.column();
        for (; f.more(); f.next())
        {
            if (f.ch() == '!' || f.ch() == '?')
            {
                termination = f.ch();
                f.next();
                break;
            }
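            // Keep only the digits 2-9. A sequence is cut off at 200 digits,
            // the size of the hitlist array used further down.
            if ('2' <= f.ch() && f.ch() <= '9' && buffer_len < 200)
                buffer[buffer_len++] = f.ch();
        }
        buffer[buffer_len] = '\0';
        // Stop at the end of the input or when the fixed-size tables are full.
        if (buffer_len == 0 || nr_lines >= 200) break;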

        lines[nr_lines] = strcopy(buffer);
        terminators[nr_lines] = termination;
        nr_lines++;
    }

    for (int i = 0; i < nr_lines; i++)
        for (int j = 0; j < nr_lines; j++)
            eqlines[i][j] = ' ';

    for (int i = 0; i < nr_lines; i++)
    {
        int max_percent = 0;
        int min_len = -1;
        bool max_complete = false;

        for (int j = 0; j < i; j++)
        {
            Score score(lines[i], lines[j]);

            if (score.percent >= max_percent)
            {
                if (score.percent > max_percent || score.len2 < min_len)
                    min_len = score.len2;
                max_percent = score.percent;
                if (score.percent == 100 && !score.partial)
                    max_complete = true;
            }
        }

        bool ignore[200];
        for (int j = 0; j < 200; j++)
            ignore[j] = false;

        printf("S%d = %s%c", i+1, lines[i], terminators[i]);
        if (max_percent == 100 && max_complete)
        {
            //printf("S%d", i+1);
            const char* alt_terminator = 0;
            for (int j = i-1; j >= 0; j--)
            {
                Score score(lines[i], lines[j]);
                if (   !score.partial
                    && (   score.percent == 100
                        || (score.at_begin && score.at_end && score.mutations <= 2)))
                {
                    if (terminators[i] != terminators[j])
                        alt_terminator =   terminators[i] == '!'
                                         ? "? instead of !" : "! instead of ?";
                    else
                    {
                        printf(" = S%d", j+1);
                        score.print_mutations();
                        eqlines[j][i] = score.mutations == 0 ? '*' : '+';
                    }
                }
            }
            if (alt_terminator != 0)
            {
                printf(" (with %s:)", alt_terminator);
                for (int j = i-1; j >= 0; j--)
                {
                    Score score(lines[i], lines[j]);
                    if (score.percent == 100 && !score.partial && terminators[i] != terminators[j])
                    {
                        printf(" = S%d", j+1);
                        score.print_mutations();
                        eqlines[j][i] = score.mutations == 0 ? '*' : '+';
                    }
                }
            }
            printf("\n");
        }
        else if (max_percent == 100 && !max_complete)
        {
            //printf("S%d", i+1);
            for (int j = i-1; j >= 0; j--)
                if (!ignore[j])
                {
                    Score score(lines[i], lines[j]);
                    if (score.percent == 100 && score.len2 == min_len)
                    {
                        char mode = ' ';
                        if (score.at_begin)
                        {
                            printf(" = start of S%d", j+1);
                            mode = 'S';
                        }
                        else if (score.at_end)
                        {
                            printf(" = end of S%d", j+1);
                            mode = 'E';
                        }
                        else
                        {
                            printf(" = S%d[%d-%d]", j+1, score.first_i2, score.last_i2);
                            mode = 'P';
                        }
                        eqlines[j][i] = mode;
                        bool first = true;
                        for (int k = j-1; k >= 0; k--)
                            if (eqlines[k][j] == '*')
                            {
                                printf(" %s= S%d", first ? "(" : "", k+1);
                                ignore[k] = true;
                                first = false;
                                eqlines[k][i] = mode;
                            }
                        if (!first)
                            printf(")");
                    }
                }
            printf("\n");
        }
        else
        {
            int percents[20];
            int nr_percents = 0;

            for (int j = i-1; j >= 0; j--)
                {
                    Score score(lines[i], lines[j]);
                    int percent = score.percent;
                    if (percent > 50 || percent > max_percent - 10)
                    {
                        bool done = false;
                        for (int k = 0; k < nr_percents; k++)
                            if (percent > percents[k])
                            {
                                int p = percents[k];
                                percents[k] = percent;
                                percent = p;
                            }
                            else if (percent == percents[k])
                            {
                                done = true;
                                break;
                            }
                        if (!done && nr_percents < 20)
                            percents[nr_percents++] = percent;
                    }
                }

            bool hitlist[200];
            for (int j = 0; j < 200; j++)
                hitlist[j] = false;

            bool need_sep = false;
            for (int p = 0; p < nr_percents; p++)
            {
                int percent = percents[p];

                for (int j = i-1; j >= 0; j--)
                    if (!ignore[j])
                    {
                        Score score(lines[i], lines[j]);
                        if (score.percent == percent && score.hits(hitlist))
                        {
                            char mode = ' ';
                            if (need_sep)
                                printf(",");
                            bool size_mentioned = false;
                            int size = score.last_i1 - score.first_i1 + 1;
                            if (score.first_i1 == 0 && score.last_i1 == score.len1-1)
                                ;
                            else
                            {
                                if (score.first_i1 == 0)
                                    printf(" first %d", size);
                                else if (score.last_i1 == score.len1-1)
                                    printf(" last %d", size);
                                else
                                    printf(" [%d-%d]", score.first_i1, score.last_i1);
                                size_mentioned = true;
                            }
                            printf(" = ");
                            if (score.first_i2 == 0 && score.last_i2 == score.len2-1)
                                printf("S%d", j+1);
                            else
                            {
                                if (score.first_i2 == 0)
                                {
                                    printf("start of S%d", j+1);
                                    if (score.percent >= 75)
                                        mode = 's';
                                }
                                else if (score.last_i2 == score.len2-1)
                                {
                                    printf("end of S%d", j+1);
                                    if (score.percent >= 75)
                                        mode = 'e';
                                }
                                else
                                {
                                    printf("S%d[%d-%d]", j+1, score.first_i2, score.last_i2);
                                    if (score.percent >= 75)
                                        mode = 'p';
                                }
                            }
                            if (score.percent > 90)
                                mode = score.len1 == score.len2 ? 'm' : '+';
                            eqlines[j][i] = mode;

                            score.print_mutations();
                            bool first = true;
                            for (int k = j-1; k >= 0; k--)
                            {
                                char eq = eqlines[k][j];
                                if (eq == '*' || eq == 'S' || eq == 'E' || eq == 'P')
                                {
                                    if (first)
                                        printf(" (S%d", j+1);
                                    first = false;
                                    if (eq == '*')
                                    {
                                        printf(" = S%d", k+1);
                                        eqlines[k][i] = mode;
                                    }
                                    else if (eq == 'S')
                                    {
                                        printf(" = start of S%d", k+1);
                                        eqlines[k][i] = mode == 'm' || mode == '+' ? 'S' : mode == 's' ? 's' : '+';
                                    }
                                    else if (eq == 'E')
                                    {
                                        printf(" = end of S%d", k+1);
                                        eqlines[k][i] = mode == 'm' || mode == '+' ? 'E' : mode == 'e' ? 'e' : '+';
                                    }
                                    else if (eq == 'P')
                                    {
                                        printf(" = part of S%d", k+1);
                                        eqlines[k][i] = mode == 'm' || mode == '+' ? 'P' : mode == 'p' ? 'p' : '+';
                                    }
                                    ignore[k] = true;
                                }
                            }
                            if (!first)
                                printf(")");
                            //if (newline)
                            //    printf("\n");
                            need_sep = true;
                        }
                    }
            }
            printf("\n");

        }
    }

    bool first[200];
    for (int i = 0; i < nr_lines; i++)
    {
        first[i] = true;
        for (int j = 0; j < i; j++)
            if (eqlines[j][i] == '*' && terminators[i] == terminators[j])
            {
                first[i] = false;
                break;
            }
    }

    printf("\n\n\n");
    for (int i = 0; i < nr_lines; i++)
    {
        printf("S%-3d =    ", i+1);

        int c = 0;
        for (int j = 0; j <= i; j++)
            if (eqlines[j][i] == '*' && terminators[i] != terminators[j])
                printf("&");
            else if (eqlines[j][i] == '*' && first[j])
            {
                printf("#");
                first[j] = false;
            }
            else
                printf("%c",eqlines[j][i]);
        printf("\\\n");
    }

    return 0;
}

Results of analyzing repeated sequences

Below is the output generated by the above program (version of August 15, 2011), run on the corrected OCR output including the corrections of August 14, 2011 and June 9, 2013. (The line for S86 has been extended with ", first 15 = end of S55 (S55 = S14)", taken from the output generated without the last corrections.)
S1 = 696263473543258626345878277483328843247767846333863463673465357!
S2 = 6432267425638726343? = start of S1 with 13 mutations
S3 = 5763586263458782774839288432477678463338! = S1[9-48] with 4 mutations
S4 = 4324776784! = S3[25-34]
S5 = 6333863963663465357! = end of S1 with 2 mutations
S6 = 6432267425638726343? = S2
S7 = 576358626345878277483328! = start of S3 with 1 mutation, = S1[9-32] with 3 mutations
S8 = 77483328343247667846838863463673467748339884324576784635526946567546! first 34 = S1[25-58] with 4 mutations
S9 = 526265952? = S7[4-12] with 4 mutations
S10 = 69626547554525264624527227742552924526! = start of S1 with 15 mutations
S11 = 422654257452526265452722774255222452! last 34 = S10[3-36] with 5 mutations, last 34 = S1[3-36] with 15 mutations
S12 = 722774255222452472272465552654656754! last 33 = S8[34-66] with 13 mutations, = S1[22-57] with 17 mutations
S13 = 4324336384! = S8[9-18] with 3 mutations
S14 = 6333863963663465353! = S5 with 1 mutation
S15 = 2233263425638326343? = S6 with 5 mutations (S6 = S2)
S16 = 5683? = S8[18-21] with 1 mutation
S17 = 536358626345838233483328! = S7 with 4 mutations
S18 = 33483328343247667846838863463! = start of S8 with 2 mutations
S19 = 227742552924526! = end of S10
S20 = 422654257452526265452722774255222452! = S11
S21 = 722774255222452472272465552654656754! = S12
S22 = 65557! = S21[22-26] with 1 mutation (S21 = S12), = end of S5 with 1 mutation
S23 = 64522674256526! first 11 = start of S6 with 1 mutation (S6 = S2), = S21[14-27] with 6 mutations (S21 = S12)
S24 = 26545? = S20[15-19] (= S11)
S25 = 5765526265452722774259222452455652465552! [4-27] = end of S20 with 1 mutation (S20 = S11), last 27 = start of S21 with 5 mutations (S21 = S12), first 36 = start of S3 with 14 mutations
S26 = 45245565! = S25[25-32]
S27 = 5683? = S16
S28 = 5565526263458382334839288432433638463338! = S3 with 9 mutations
S29 = 432433638463! = S28[25-36]
S30 = 5683? = S27 = S16
S31 = 5683? = S30 = S27 = S16
S32 = 5683! (with ? instead of !:) = S31 = S30 = S27 = S16
S33 = 422654257452526265452722745246358626345878277483328! first 28 = start of S20 with 2 mutations (S20 = S11), last 22 = end of S7
S34 = 65557! = S22
S35 = 64522674256526! = S23
S36 = 26545? = S24
S37 = 57655262654527227742592224524! = start of S25
S38 = 5683? = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S39 = 55652463673467748339884324576784635526946567546! last 42 = end of S8, = S1[12-58] with 19 mutations
S40 = 526265952? = S9
S41 = 69626547554525264624527227742552924526! = S10
S42 = 422654257452526265452722774255222452! = S20 = S11
S43 = 722774255222452472272465552654656754! = S21 = S12
S44 = 65557! = S34 = S22
S45 = 64522674256526! = S35 = S23
S46 = 26545? = S36 = S24
S47 = 57655262654527227742592224524! = S37
S48 = 5683? = S38 = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S49 = 55652465552! = end of S25
S50 = 45245565! = S26
S51 = 2552924526! = end of S19
S52 = 4226542! = start of S42 (= S20 = S11)
S53 = 5565526263458382334839288432433638463338! = S28
S54 = 4324336384! = S13
S55 = 6333863963663465353! = S14 = S5 with 1 mutation
S56 = 2233263425638326343? = S15
S57 = 5683? = S48 = S38 = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S58 = 536358626345838233483328! = S17
S59 = 272465552654656754! = end of S43 (= S21 = S12)
S60 = 65557! = S44 = S34 = S22
S61 = 64522674256526! = S45 = S35 = S23
S62 = 26545? = S46 = S36 = S24
S63 = 5765526265452722774259222452455652465552! = S25
S64 = 45245565! = S50 = S26
S65 = 5683? = S57 = S48 = S38 = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S66 = 556552626345838233483928843243465552! first 32 = start of S53 with 1 mutation (S53 = S28), first 33 = start of S63 with 14 mutations (S63 = S25)
S67 = 45245565! = S64 = S50 = S26
S68 = 6545? = end of S62 (= S46 = S36 = S24)
S69 = 45? = end of S68
S70 = 5565526263458382334839288432433638463338! = S53 = S28
S71 = 4324336384! = S54 = S13
S72 = 633367425638726343? = end of S6 with 3 mutations (S6 = S2), = end of S56 with 4 mutations (S56 = S15)
S73 = 576358626345878277483328! = S7
S74 = 77483328343247667846838863463673467748339884324576784635526946567546! = S8
S75 = 526265952? = S40 = S9
S76 = 69626547554525264624527227742552924526! = S41 = S10
S77 = 422654257452526265452722774255222452! = S42 = S20 = S11
S78 = 722774255222452472272465552654656754! = S43 = S21 = S12
S79 = 65557! = S60 = S44 = S34 = S22
S80 = 64522674256526! = S61 = S45 = S35 = S23
S81 = 26545? = S62 = S46 = S36 = S24
S82 = 57655262654527227742592224524! = S47 = S37
S83 = 5683? = S65 = S57 = S48 = S38 = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S84 = 55652465552! = S49
S85 = 45245565! = S67 = S64 = S50 = S26
S86 = 8639636634653532233263425638326343? last 19 = S56 (S56 = S15), first 15 = end of S55 (S55 = S14)
S87 = 5683? = S83 = S65 = S57 = S48 = S38 = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S88 = 536358626345838233483328! = S58 = S17
S89 = 33483328343247667846838863463! = S18
S90 = 2277467425638726343? = end of S86 with 5 mutations, last 14 = end of S72
S91 = 576358626345878277483328! = S73 = S7
S92 = 77483328343247667846838863463673467748339884324576784635526946567546! = S74 = S8
S93 = 526265952? = S75 = S40 = S9
S94 = 69626547554525264624527227742552924526! = S76 = S41 = S10
S95 = 422654257452526265452722774255222452! = S77 = S42 = S20 = S11
S96 = 722774255222452472272465552654656754! = S78 = S43 = S21 = S12
S97 = 65557! = S79 = S60 = S44 = S34 = S22
S98 = 64522674256526! = S80 = S61 = S45 = S35 = S23
S99 = 26545? = S81 = S62 = S46 = S36 = S24
S100 = 57655262654527227742592224524! = S82 = S47 = S37
S101 = 5683? = S87 = S83 = S65 = S57 = S48 = S38 = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S102 = 55652465552! = S84 = S49
S103 = 45245565! = S85 = S67 = S64 = S50 = S26
S104 = 2552924526! = S51
S105 = 4226542! = S52
S106 = 5565526263458382334839288432433638463338! = S70 = S53 = S28
S107 = 4324336384! = S71 = S54 = S13
S108 = 6333863963663465353! = S55 = S14 = S5 with 1 mutation
S109 = 2233263425638326343? = S56 = S15
S110 = 5683? = S101 = S87 = S83 = S65 = S57 = S48 = S38 = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S111 = 536358626345838233483328! = S88 = S58 = S17
S112 = 272465552654656754! = S59
S113 = 65557! = S97 = S79 = S60 = S44 = S34 = S22
S114 = 64522674256526! = S98 = S80 = S61 = S45 = S35 = S23
S115 = 26545? = S99 = S81 = S62 = S46 = S36 = S24
S116 = 5765526265452722774259222452455652465552! = S63 = S25
S117 = 45245565! = S103 = S85 = S67 = S64 = S50 = S26
S118 = 5683? = S110 = S101 = S87 = S83 = S65 = S57 = S48 = S38 = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S119 = 5565526263458382334839288432433638463338! = S106 = S70 = S53 = S28
S120 = 432433638463! = S29
S121 = 5683? = S118 = S110 = S101 = S87 = S83 = S65 = S57 = S48 = S38 = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S122 = 5683? = S121 = S118 = S110 = S101 = S87 = S83 = S65 = S57 = S48 = S38 = S31 = S30 = S27 = S16 (with ! instead of ?:) = S32
S123 = 5683! = S32 (with ? instead of !:) = S122 = S121 = S118 = S110 = S101 = S87 = S83 = S65 = S57 = S48 = S38 = S31 = S30 = S27 = S16
S124 = 422654257452526265452722745246358626345878277483328! = S33
S125 = 77483328343247667846838863463673467748339884324576784635526946567546! = S92 = S74 = S8
S126 = 526265952? = S93 = S75 = S40 = S9
S127 = 6962654565246555274255222452! last 25 = end of S95 with 11 mutations (S95 = S77 = S42 = S20 = S11), last 26 = S116[2-27] with 13 mutations (S116 = S63 = S25), first 25 = start of S94 with 13 mutations (S94 = S76 = S41 = S10)
S128 = 722774255222452472272465552654656754! = S96 = S78 = S43 = S21 = S12
S129 = 65557! = S113 = S97 = S79 = S60 = S44 = S34 = S22
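
A note on reading the listing above: ranges such as S1[9-48] appear to be zero-based and inclusive, and "mutations" appear to count substituted digits. That reading reproduces "S3 = S1[9-48] with 4 mutations", "S4 = S3[25-34]" and "S5 = end of S1 with 2 mutations". A minimal sketch in C under these assumptions follows. The function names are hypothetical and are not taken from the program; the digit strings are copied from the listing without their terminators.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Digit strings copied from the listing above (terminators dropped). */
static const char S1[] = "696263473543258626345878277483328843247767846333863463673465357";
static const char S3[] = "5763586263458782774839288432477678463338";
static const char S4[] = "4324776784";
static const char S5[] = "6333863963663465357";

/* Count the positions at which two digit strings differ over len digits.
   Assumption: a "mutation" is one substituted digit (no insertions or deletions). */
int count_mutations(const char *a, const char *b, size_t len)
{
    assert(a != NULL && b != NULL);
    int n = 0;
    for (size_t i = 0; i < len; i++)
        if (a[i] != b[i])
            n++;
    return n;
}

/* Compare seq with whole[from..to], where from and to are zero-based and
   inclusive (an assumption about the notation in the listing).
   Returns the number of mutations, or -1 if the range does not fit. */
int mutations_in_range(const char *seq, const char *whole, size_t from, size_t to)
{
    size_t len = strlen(seq);
    if (to < from || to >= strlen(whole) || to - from + 1 != len)
        return -1;
    return count_mutations(seq, whole + from, len);
}
```

With these definitions, mutations_in_range(S3, S1, 9, 48) gives 4, mutations_in_range(S4, S3, 25, 34) gives 0, and mutations_in_range(S5, S1, 44, 62) gives 2, matching the lines for S3, S4 and S5.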



S1   =     \
S2   =      \
S3   =    p  \
S4   =      P \
S5   =    e    \
S6   =     #    \
S7   =    p +    \
S8   =            \
S9   =             \
S10  =              \
S11  =             p \
S12  =                \
S13  =                 \
S14  =        m         \
S15  =                   \
S16  =           p        \
S17  =                     \
S18  =           +          \
S19  =             E         \
S20  =              #         \
S21  =               #         \
S22  =        e      p        p \
S23  =                           \
S24  =              P        P    \
S25  =                             \
S26  =                            P \
S27  =                   #           \
S28  =                                \
S29  =                               P \
S30  =                   *          *   \
S31  =                   *          *  * \
S32  =                   &          &  && \
S33  =                                     \
S34  =                         #            \
S35  =                          #            \
S36  =                           #            \
S37  =                            S            \
S38  =                   *          *  **&      \
S39  =           e                               \
S40  =            #                               \
S41  =             #                               \
S42  =              *        *                      \
S43  =               *        *                      \
S44  =                         *           *          \
S45  =                          *           *          \
S46  =                           *           *          \
S47  =                                        #          \
S48  =                   *          *  **&     *          \
S49  =                            E                        \
S50  =                             #                        \
S51  =                      E                                \
S52  =              S        S                     S          \
S53  =                               #                         \
S54  =                #                                         \
S55  =        +        #                                         \
S56  =                  #                                         \
S57  =                   *          *  **&     *         *         \
S58  =                    #                                         \
S59  =               E        E                     E                \
S60  =                         *           *         *                \
S61  =                          *           *         *                \
S62  =                           *           *         *                \
S63  =                            #                                      \
S64  =                             *                       *              \
S65  =                   *          *  **&     *         *        *        \
S66  =                               s                        s             \
S67  =                             *                       *             *   \
S68  =                           E           E         E               E      \
S69  =                                                                       E \
S70  =                               *                        *                 \
S71  =                *                                        *                 \
S72  =     e   e        e                                        e                \
S73  =          #                                                                  \
S74  =           #                                                                  \
S75  =            *                              *                                   \
S76  =             *                              *                                   \
S77  =              *        *                     *                                   \
S78  =               *        *                     *                                   \
S79  =                         *           *         *               *                   \
S80  =                          *           *         *               *                   \
S81  =                           *           *         *               *                   \
S82  =                                        *         *                                   \
S83  =                   *          *  **&     *         *        *       *                  \
S84  =                                                    #                                   \
S85  =                             *                       *             *  *                  \
S86  =                                                                                          \
S87  =                   *          *  **&     *         *        *       *                 *    \
S88  =                    *                                        *                              \
S89  =                     #                                                                       \
S90  =                                                                                              \
S91  =          *                                                                 *                  \
S92  =           *                                                                 *                  \
S93  =            *                              *                                  *                  \
S94  =             *                              *                                  *                  \
S95  =              *        *                     *                                  *                  \
S96  =               *        *                     *                                  *                  \
S97  =                         *           *         *               *                  *                  \
S98  =                          *           *         *               *                  *                  \
S99  =                           *           *         *               *                  *                  \
S100 =                                        *         *                                  *                  \
S101 =                   *          *  **&     *         *        *       *                 *   *              \
S102 =                                                    *                                  *                  \
S103 =                             *                       *             *  *                 *                  \
S104 =                                                      #                                                     \
S105 =                                                       #                                                     \
S106 =                               *                        *                *                                    \
S107 =                *                                        *                *                                    \
S108 =        +        *                                        *                                                     \
S109 =                  *                                        *                                                     \
S110 =                   *          *  **&     *         *        *       *                 *   *             *         \
S111 =                    *                                        *                             *                       \
S112 =                                                              #                                                     \
S113 =                         *           *         *               *                  *                 *                \
S114 =                          *           *         *               *                  *                 *                \
S115 =                           *           *         *               *                  *                 *                \
S116 =                            *                                     *                                                     \
S117 =                             *                       *             *  *                 *                 *              \
S118 =                   *          *  **&     *         *        *       *                 *   *             *        *        \
S119 =                               *                        *                *                                   *             \
S120 =                                #                                                                                           \
S121 =                   *          *  **&     *         *        *       *                 *   *             *        *       *   \
S122 =                   *          *  **&     *         *        *       *                 *   *             *        *       *  * \
S123 =                   &          &  &&#     &         &        &       &                 &   &             &        &       &  && \
S124 =                                    #                                                                                           \
S125 =           *                                                                 *                 *                                 \
S126 =            *                              *                                  *                 *                                 \
S127 =                                                                                                                                   \
S128 =               *        *                     *                                  *                 *                                \
S129 =                         *           *         *               *                  *                 *               *                \

Copy analysis

See the online diary entry of Monday, August 15, 2011 for a copy analysis, and the entry of Tuesday, June 18, 2013 for a corrected copy analysis.

External links

I am not the only one who has worked on this. Some external links:

