The origin of CAR and CDR in LISP

A discussion in alt.folklore.computers, that started with a question from someone:

The use of the keywords CAR and CDR to denote the head and tail of a list in LISP has always struck me as odd, (To which Iain A F Fleming replied: I have always, and still do, use car and cdr as content and forward pointer, even in C. )

According to the LISP 1.5 Programmer's Manual ISBN 0 262 13011 4, `A' and `D' stand for `address' and `decrement', as on page 36 it says:

       +-----------+-----------+
       |   add.    |   dec.    |
       +-----------+-----------+

I would conclude from this that CAR and CDR are names of cpu instructions and stand for something like:

Copy Address Register
Copy Decrement Register.

David Udin wrote:

As I recall (and it has been a long time) the IBM 7090 (and possibly the 709 before it) had a 15 bit address register and a 15 bit decrement register that could be loaded from a 36 bit word in memory (the other 6 bits were used for marking words as being atoms versus s-expressions, and for marking during garbage collection). car stood for "contents of address register" (basically meaning following the pointer in that register) and cdr stood for "contents of decrement register". If the s-expression they were applied to represented a list (and of course, an s-expression needn't be a list: it is simply a pair of pointers to other atoms/s-expressions) car got you the head of the list and cdr got you the tail.

David Udin

Charles Richmond replied to David

I thought CAR and CDR came from the IBM 701 instruction set. It seems like I had an old 7090 hardware reference manual once, and CAR and CDR were *not* mentioned.

Edward Rice replied to David

Essentially correct. There was no "address" or "decrement" register, rather index registers (three on the 7090, seven on the 7094). You could (among other things) load an index register from the left or right half of a word, the decrement or the address, with LXD (load index from decrement) and LXA (load index from address). Lisp was designed almost like a macro language, to allow fast list processing on a 7094.

Fredrick Backman replied to Charles

2nd hand info: the first LISP implementation on IBM 704 had CAR (Contents of Address Register) and CDR (Contents of Data Register).

David Ulim wrote in reply to questions by Kevin D. Quitt (italics):

On which machine? That only gives Lisp 32K for its address space,

On early IBM machines: 704, 709, 7040, 7090, 7094. Perhaps someone can tell us which was the first used for implementing lisp; I first encountered it on the 7094. Yes, it had only a 32k address space. The remaining bits were used to mark words during garbage collection and, as I recall, to indicate whether a word was an atom or s-expression.

Comment by Alan Kotok

Steve Russell

which should be completely filled after about half of lisp is loaded. Most implementations used two 16-bit words, which was awkward and dangerous because it wasn't a single instruction to move/copy/update them.

And that's why the PDP-10 was so popular for lisp, because it had 36 bit words, and therefore a 256K workspace--you could actually do something significant on it without worrying (too much). The PDP-10 also had really nifty half-word instructions which made the implementation faster and safer.

Rumor at the time had it that Alan R. Kotok, (one of?) the designer(s) of the PDP-10 was influenced by the needs of lisp implementation in designing the PDP-10.

Comment by Alan Kotok

Minsky's lab had a 256K (word) memory built for them, which was immediately exhausted by Pete Samson's lisp program for finding optimal routes through the NYC transit system (he and others were trying to find a route that would cover the entire system in under 24 hours.)

Comment by Alan Kotok

Comment by Pete Samson:
Yes, the subway program filled the 256K-word Fabri-Tek memory. To reduce the time spent garbage collecting, some of the numerical routines were written by hand in LAP (Lisp Assembly Program) to avoid consing of intermediate numerical results.

Probably just visiting each station on the subway map could have been done in less than 24 hours, but we undertook a slightly different problem, to travel over each segment of right-of-way; it took more than 25 hours (all on a single fare, of course). Fortunately, the New York subway system is open all night.

Bill Sudbrink wrote:

I thought that the D in CDR was data too. Some previous posts were saying that the D was decrement. I wasn't sure if it was a subtle troll or just a mistake. I can't find my copy of _The Little Lisper_ anywhere but I was sure it was data.

Charles Richmond wrote

If you are discussing the *origin* of CAR and CDR, please consider the following. It is a footnote on page 13 of John Allen's book Anatomy of Lisp and discusses the origin of CAR and CDR:

Note that the machine referenced was the IBM 704, *not* the 7090 or the 709.

Bill Sudbrink replied again:

OK, so it's decrement. I have never encountered decrement used like this. What does it mean?

Allan J. Baum wrote:

A possible explanation: the 704 had subtractive indexing (basereg-indexreg)

The offset (or address part) of an inst. was in the low half. The decrement field was in the high half.

In index register ops, the decr. field was used to compare/modify(by addition) index registers, while the address in the low half was used general as a branch displacement. Non index-reg ops used this field as extended opcode & indirect specifier selector

Which still begs the question of why it was called the decrement field. Index regs could be transferred from either half of the accum or memory

Ron Hunsinger replies:

(In reply to: the 704 had subtractive indexing (basereg-indexreg):)

Bingo! You've got it.

Except that the decrement didn't have to come from an index register. It could come from the decrement field of the instruction, as a literal.

The 704 and successor machines did not have clean addressing modes or a consistent instruction format, but many instructions broke the 36-bit instruction word into fields of 3:15:3:15.

3: Opcode (or opcode family in most cases, with the rest of the opcode coming from the other fields)
15: Decrement field, used in many instructions to provide a literal value to be subtracted from an effective address or a register. In other instructions, this was an extension of the opcode field.
3: Index field. The 7090 had three index registers, each of which corresponded to one bit of this field. Setting multiple bits logically ORed the registers to get the index value to be subtracted. (I could be wrong about this.) The 7094 had 4 additional index registers (hence the extra 4 in the model number), and the index field selected one of the 7 index registers (or 0 for no indexing). In some instructions, this was part of the opcode.
15: Address field, used in most instructions to specify a base address.

(In reply to the last paragraph:)

Because of how the field was used in instruction words.

There were instructions to make it easier to modify the decrement and address fields of data words, because back then self-modifying code was the norm, and nobody bothered to distinguish between instructions and data.

Email from Alan Kotok

He was killing some time seeing what references there were to himself in the Web, and came across this page. Then he wrote me an email (CC: Pete Samson, and Steve Russell) to give me some first hand info. I have added the info as comments to the reply by David Ulim.

Email from Pete Samson

As Alan had CC-ed the mail to Pete, Pete added some more comments, which I also added them as comments to the reply by Edward Rice.

Email from Steve Russell

I wrote the first implimenation of a LISP interpreter on the IBM 704 at MIT in early in 1959. I hand-compiled John McCarthy's "Universal LISP Function".

The 704 family (704, 709, 7090) had "Address" and "Decrement" fields that were 15 bits long in some of the looping instructions. There were also special load and store instructions that moved these 15-bit addresses between memory and the index regiseters ( 3 on the 704, 7 on the others )

We had devised a representation for list structure that took advantage of these instructions.

Because of an unfortunate temporary lapse of inspiration, we couldn't think of any other names for the 2 pointers in a list node than "address" and "decrement", so we called the functions CAR for "Contents of Address of Register" and CDR for "Contents of Decrement of Register".

After several months and giving a few classes in LISP, we realized that "first" and "rest" were better names, and we (John McCarthy, I and some of the rest of the AI Project) tried to get people to use them instead.

Alas, it was too late! We couldn't make it stick at all. So we have CAR and CDR.

As the 704 has 36 bit words, there were 6 bits in the list nodes that were not used. Our initial implimentation did not use them at all, but the first garbage collector, comissioned in the summer of 1959, used some of them as flags.

Atoms were indicated by having the special value of all 1's in car of the first word of the property list. All 0's was NIL, the list terminator.

We were attempting to improve on "IPL-V", (for Interpretive Processing of Lists - version 5) which ran on a 650. I believe that the 0 list terminator was used there, but I believe that the all 1's flag for atoms was original.

Hope this is enlightening.

Question by Richard Simmons to Steve Russell

Steve,

Tom Eggers (here at the University of Colorado at Colorado Springs), tells me that "you were there" when they invented lisp, and you would know the answer to this question.

In the IBM 7094, the Contents of the Decrement Register and Contents of the Address Register were thus:

      +---+---------------+---+---------------+
      |   |      CDR      |   |      CAR      |
      +---+---------------+---+---------------+

and yet lisp was invented so that (pictorially) the CAR was on the left and the CDR was on the right. Why?

Answer from Steve Russell

First a correction: The first implementations of Lisp were on the IBM 704, the vacuum-tube ancestor of the vacuum-tube 709, the ancestor of the transistorized 7090, later upgraded to the 7094. The time was early in 1959.

I believe that we started writing list structures in the "natural" (to us english-speakers) from left to right before we had fixed on implimentation details on the 704. ( In fact, I believe that IPL-V wrote them that way ).

I don't remember how we decided to use the address for the first element, but I suspect it had to do with guessing that it would be referenced more, and that there were situations where a memory cycle would be saved when the pointer was in the address.

Hope this is sufficient. Sorry its not a better story.

Links to LISP sources

History of Lisp by John McCarthy
picture of John McCarthy
Early LISP history
Lisp History, or: Origins of Misunderstandings, chapter 34 of Successful Lisp: How to Understand and Use Common Lisp by David B. Lamkins.

Back to my `life as a hacker' page.