The history of the RIM10B loader is included at the end of this document.

This program is the bootstrap loader that was used to do a "cold start" on a PDP-10 (model KA or KI) via the paper-tape reader. It assembles each 36-bit word from 6 consecutive 8-bit paper-tape frames, loads the words into core memory, and verifies checksums. The entire program is fewer than 16 words long, so it fits into locations 000000 through 000017 (octal), the locations that stand in for the accumulators when the "FM Enable" switch is off. (FM = Fast Memory = 16 accumulators, 36 bits each, implemented as flip-flops instead of core memory.)
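
As a rough illustration of the frame-to-word packing described above, the following Python sketch assembles one 36-bit word from six tape frames. It assumes the reader's binary-mode convention in which only frames with channel 8 punched are used and the low six channels of each such frame supply the data bits, first frame in the high-order position. The names `pack_word` and `halves` are hypothetical helpers, not part of any PDP-10 software.

```python
def pack_word(frames):
    """Assemble one 36-bit word from 8-bit paper-tape frames.

    Assumption: only frames with channel 8 (mask 0o200) punched count,
    and each such frame contributes its low six bits, first frame first.
    """
    word = 0
    used = 0
    for frame in frames:
        if not frame & 0o200:      # no channel-8 hole: skip (leader, blank tape)
            continue
        word = (word << 6) | (frame & 0o77)
        used += 1
        if used == 6:
            return word
    raise ValueError("fewer than six data frames")


def halves(word):
    """Format a 36-bit word as left,,right in octal, as in the listing."""
    return f"{word >> 18:06o},,{word & 0o777777:o}"


# The loader's first word, XWD -16,0, split into six 6-bit groups.
frames = [0o200 | b for b in (0o77, 0o77, 0o62, 0o00, 0o00, 0o00)]
print(halves(pack_word(frames)))   # prints 777762,,0
```

Six frames of six bits each give exactly 36 bits, which is why one word occupies six frames of tape.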

RIM10B Loader

                                RIM10B                  ; Causes RIM10B loader to be punched
00/   777762,,0                 XWD     -16,0
01/   710600,,60        ST:     CONO    PTR,60
02/   541400,,4         ST1:    HRRI    A,RD+1
03/   710740,,10        RD:     CONSO   PTR,10
04/   254000,,3                  JRST   .-1
05/   710470,,7                 DATAI   PTR,@TBL1-RD+1(A)
06/   256010,,7                 XCT     TBL1-RD+1(A)
07/   256010,,12                 XCT    TBL2-RD+1(A)
10/   364400,,0         A:      SOJA    A,              ; Magic occurs here ****
11/   312740,,16        TBL1:   CAME    CKSM,ADR
12/   270756,,1                  ADD    CKSM,1(ADR)
13/   331740,,16                SKIPL   CKSM,ADR
14/   254200,,1         TBL2:    JRST   4,ST
15/   253700,,3                 AOBJN   ADR,RD
16/   254000,,2         ADR:    JRST    ST1
17/                             CKSM=ADR+1

Here is an example of a two-word program as output by RIM10B:

17/   777776,,777               LOC     1000            ; Set starting address
20/   201740,,3777      START:  MOVEI   17,4000-1
21/   505740,,777600            HRLI    17,-200
22/   707677,,4576                                      ; Sum of previous 3 words
23/   254000,,1000              END     START
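
The checksum word at 22/ can be reproduced by hand. The sketch below, in Python, adds the three preceding words as 36-bit quantities; it assumes that overflow out of bit 0 is simply discarded, as with a 36-bit ADD. The sum includes the IOWD because the loader loads that word into CKSM before any data is added. The helpers `xwd` and `fmt` are hypothetical names used only for this illustration.

```python
MASK36 = (1 << 36) - 1


def xwd(left, right):
    """Build a 36-bit word from two 18-bit halves."""
    return ((left & 0o777777) << 18) | (right & 0o777777)


def fmt(word):
    """Format a 36-bit word as left,,right in octal, as in the listing."""
    return f"{word >> 18:06o},,{word & 0o777777:o}"


block = [
    xwd(0o777776, 0o777),      # 17/ IOWD: -2 words, address 1000 minus one
    xwd(0o201740, 0o3777),     # 20/ first program word
    xwd(0o505740, 0o777600),   # 21/ second program word
]
checksum = sum(block) & MASK36   # 36-bit add, overflow discarded
print(fmt(checksum))             # prints 707677,,4576
```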

Analysis

RIM
When the Read-In Mode (RIM) switch is pressed on the console of a KA or KI, the processor sends a reset pulse down the I/O bus, sets the PC flags to zero, and executes "DATAI D,0", where D is the device code selected by a set of 7 switches (the paper tape reader is device 104). The DATAI reads in an IOWD, which has the negative word count in the left half and the starting address minus one in the right half. The CPU then repeatedly executes "BLKI D,0" until the left half of location 0 reaches zero. ("BLKI D,X" increments both halves of location X, reads in a word from device D, and stores it at the address that the right half of location X now points to.)
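
The read-in sequence just described can be sketched in Python as follows. This is a simplified model, not a hardware emulation: the device is represented by a Python iterable of 36-bit words, the two halves of location 0 are incremented independently, and the function name `read_in` is hypothetical.

```python
HALF = 0o777777


def read_in(tape):
    """Simulate hardware read-in: DATAI D,0 followed by repeated BLKI D,0.

    `tape` is an iterable of 36-bit words; returns the resulting memory
    as a dict mapping address to word.
    """
    words = iter(tape)
    memory = {0: next(words)}                    # DATAI D,0 stores the IOWD in location 0
    while memory[0] >> 18 != 0:                  # until the left half (count) reaches zero
        left = ((memory[0] >> 18) + 1) & HALF    # BLKI increments both halves...
        right = ((memory[0] & HALF) + 1) & HALF
        memory[0] = (left << 18) | right
        memory[right] = next(words)              # ...then stores the next word there
    return memory


# XWD -16,0 followed by fourteen placeholder words standing in for the loader.
iowd = 0o777762 << 18
tape = [iowd] + [0o1000 + i for i in range(14)]
memory = read_in(tape)
print(oct(memory[0]), len(memory) - 1)           # prints 0o16 14
```

With a count of -16 (octal), fourteen words land in locations 1 through 16 (octal), and location 0 ends up holding 0,,16.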
00/ XWD -16,0
Transfer 16 octal (14 decimal) words, starting at location 1.
01/ST: CONO PTR,60
Start paper tape reader in binary mode
02/ST1: HRRI A,RD+1
Reset the finite-state machine to the "looking for IOWD" state

State RD+1 = Looking for IOWD or JRST

03/RD: CONSO PTR,10
Read paper tape reader status, skip if "DONE" bit is set
04/ JRST .-1
Not set, keep looping until the bit does get set
05/ DATAI PTR,@TBL1-RD+1(A)
Index register A holds RD+1, so the indexed address TBL1-RD+1+RD+1 is TBL1+2, which is the "SKIPL CKSM,ADR" instruction. Because the DATAI is indirect (@), the address field of that instruction supplies the effective address, ADR. Store the IOWD in ADR.
06/ XCT TBL1-RD+1(A)
Same effective address. "SKIPL CKSM,ADR" loads the IOWD into accumulator CKSM, and skips the next instruction because it's negative.
07/ XCT TBL2-RD+1(A)
Skipped when the word just read is an IOWD. At the end of the tape, a JRST instruction is read in instead of an IOWD. (JRST is opcode 254, which is positive, so the SKIPL does not skip.) TBL2-RD+1+RD+1 is TBL2+2, which is ADR. The JRST instruction that was just read in is executed, which sends the PC to the beginning of the loaded program.
10/A: SOJA A,RD+1
The effective address (RD+1) is calculated first. Then one is subtracted from index register A (so it now has RD in the right half), and the PC jumps to the original address (RD+1).

Note: This is a self-modifying instruction. The CPU, however, remembers the effective address that the instruction used to have.

04/ JRST .-1
Jump to location 3.

State RD+0 = Reading in data words

03/RD: CONSO PTR,10
Read paper tape reader status, skip if "DONE" bit is set
04/ JRST .-1
Not set, keep looping until the bit does get set
05/ DATAI PTR,@TBL1-RD+1(A)
Index register A holds RD+0, so the indexed address TBL1-RD+1+RD+0 is TBL1+1, which is the "ADD CKSM,1(ADR)" instruction. Because the DATAI is indirect (@), the effective address is taken from that instruction: one greater than what ADR points to. Store the data word there.
06/ XCT TBL1-RD+1(A)
Same effective address, "ADD CKSM,1(ADR)" adds the word read in to the additive checksum in accumulator CKSM.
07/ XCT TBL2-RD+1(A)
The address is TBL2-RD+1+RD+0 which is TBL2+1. That location has "AOBJN ADR,RD". Add one to both halves of accumulator ADR. If the result is still negative, loop back to RD (location 3). If non-negative, continue on at location 10.
10/A: SOJA A,RD+0
The effective address (RD+0) is calculated first. Then one is subtracted from index register A (so it now has RD-1 in the right half), and the PC jumps to the original address (RD+0).

Note: This is a self-modifying instruction. The CPU, however, remembers the effective address that the instruction used to have.

State RD-1 = Reading in checksum

03/RD: CONSO PTR,10
Read paper tape reader status, skip if "DONE" bit is set
04/ JRST .-1
Not set, keep looping until the bit does get set
05/ DATAI PTR,@TBL1-RD+1(A)
Index register A holds RD-1, so the indexed address TBL1-RD+1+RD-1 is TBL1+0, which is the "CAME CKSM,ADR" instruction. Because the DATAI is indirect (@), the address field of that instruction supplies the effective address, ADR. Store the expected checksum in ADR.
06/ XCT TBL1-RD+1(A)
Same effective address, "CAME CKSM,ADR" compares the calculated checksum in accumulator CKSM with the expected checksum stored in memory location ADR. Skip the next instruction if they're equal.
07/ XCT TBL2-RD+1(A)
The address is TBL2-RD+1+RD-1 which is TBL2+0. That location has "JRST 4,ST" which is a HALT instruction. If the previous compare instruction failed, set the program counter to ST and halt the CPU. This allows the operator to back up the paper tape reader and try again. If the CAME succeeded, this HALT is not executed.
10/A: SOJA A,RD-1
The effective address (RD-1) is calculated first. Then one is subtracted from index register A (so it now has RD-2 in the right half), and the PC jumps to the original address (RD-1). That is location ST1, which resets the finite-state machine.
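
The three states walked through above can be summarized in a behavioural sketch. The Python below is not a PDP-10 emulator: it follows the same sequence of decisions (IOWD or JRST, data words, checksum) on a list of 36-bit words. The HALT is modelled as an exception, the final JRST is assumed to carry a plain address with no indexing or indirection, and every name (`rim10b_load`, `aobjn`, `ChecksumError`, `xwd`) is hypothetical.

```python
MASK36 = (1 << 36) - 1
HALF = 0o777777
SIGN = 1 << 35                      # bit 0 of a PDP-10 word: set means negative


class ChecksumError(Exception):
    """Stands in for the loader's HALT (JRST 4,ST)."""


def aobjn(word):
    """Add one to both halves of a 36-bit word, as AOBJN does."""
    left = ((word >> 18) + 1) & HALF
    right = ((word & HALF) + 1) & HALF
    return (left << 18) | right


def rim10b_load(tape):
    """Load a RIM10B-format tape; return (memory, start address)."""
    words = iter(tape)
    memory = {}
    while True:
        # State RD+1: the next word is an IOWD or the final JRST.
        adr = next(words)
        cksm = adr                               # SKIPL CKSM,ADR loads it
        if not adr & SIGN:                       # non-negative, so it is the JRST
            return memory, adr & HALF            # assumes a plain JRST address
        # State RD+0: store data words until the count in ADR runs out.
        while adr & SIGN:
            data = next(words)
            memory[((adr & HALF) + 1) & HALF] = data   # DATAI stores at 1(ADR)
            cksm = (cksm + data) & MASK36              # ADD CKSM,1(ADR)
            adr = aobjn(adr)                           # AOBJN ADR,RD
        # State RD-1: compare the running sum with the checksum word.
        if next(words) != cksm:                  # CAME CKSM,ADR
            raise ChecksumError("checksum mismatch")


def xwd(left, right):
    """Build a 36-bit word from two 18-bit halves."""
    return (left << 18) | right


# The two-word example tape from the listing near the top of this page.
tape = [
    xwd(0o777776, 0o777),
    xwd(0o201740, 0o3777),
    xwd(0o505740, 0o777600),
    xwd(0o707677, 0o4576),
    xwd(0o254000, 0o1000),
]
memory, start = rim10b_load(tape)
print(oct(start), sorted(oct(a) for a in memory))   # prints 0o1000 ['0o1000', '0o1001']
```

What the real loader achieves with one self-decrementing index register, this sketch spells out as ordinary control flow, which is roughly the extra space the earlier seventeen-word attempts had to pay for.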

Dispatch table for finite-state machine

11/TBL1: CAME CKSM,ADR
In state RD-1, read expected checksum into ADR, then compare calculated checksum with expected checksum.
12/ ADD CKSM,1(ADR)
In state RD+0, store data word into memory, then add data word into running checksum.
13/ SKIPL CKSM,ADR
In state RD+1, store IOWD or JRST in ADR, then load that word into accumulator CKSM and skip if the word is negative.
14/TBL2: JRST 4,ST
If the checksum comparison fails, halt the CPU, with ST in the PC.
15/ AOBJN ADR,RD
In state RD+0, increment the IOWD and jump to RD if more to go.
16/ADR: JRST ST1
This is the last word of the RIM10B loader. When the hardware read-in process is completed, this instruction is executed to start the loader at ST1. Afterwards the location serves as accumulator ADR.
17/CKSM=ADR+1
The additive checksum is calculated using this accumulator.
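
The table arithmetic can be checked mechanically. The short Python sketch below takes the addresses from the listing (RD = 3, TBL1 = 11, TBL2 = 14, all octal) and computes which table entries the two indexed XCT instructions select for each value of index register A. The dictionary `LOADER` and the function `dispatch` are hypothetical names for this illustration.

```python
RD, TBL1, TBL2 = 0o3, 0o11, 0o14     # addresses from the listing (octal)

LOADER = {
    0o11: "CAME CKSM,ADR",
    0o12: "ADD CKSM,1(ADR)",
    0o13: "SKIPL CKSM,ADR",
    0o14: "JRST 4,ST",
    0o15: "AOBJN ADR,RD",
    0o16: "ADR (IOWD or JRST just read)",
}


def dispatch(a):
    """Return the two table entries selected when index register A holds `a`."""
    first = TBL1 - RD + 1 + a      # used by the DATAI (indirectly) and the first XCT
    second = TBL2 - RD + 1 + a     # used by the second XCT
    return LOADER[first], LOADER[second]


for name, a in (("RD+1", RD + 1), ("RD+0", RD), ("RD-1", RD - 1)):
    print(name, dispatch(a))
```

Each decrement of A moves both XCT instructions one entry up their tables, which is how a single SOJA advances the whole state machine.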

Storing bootstrap in core memory

The FM ENB switch enables Fast Memory, causing references to the accumulators (locations 00 through 17) to go to the fast-memory flip-flops instead of core memory. When FM ENB is off, those references go to core, so the above bootstrap can be toggled into core locations 01 through 16. (Locations 00 and 17 need not be initialized.)


Notes

History of the RIM10B loader

From: Bob Clements
To: inwap@best.com
Subject: Re: RIM10B bootstrap loader for the PDP-10
Newsgroups: alt.folklore.computers,alt.sys.pdp10

Hi, Joe,

As I said in a previous article about RUNOFF, I'm avoiding USENET postings until I get a little spare time to fake my address to avoid email spam. Feel free to post this if you omit my email address except in the form I put on the last line.

>>The true magic of the PDP-10 instruction set was in the RIM-10B loader.
>>The program is 14 instructions long and uses 2 accumlators for data.
>>It reads in 36-bit words from an 8-bit paper tape reader, deposits the
>>data in memory, and verifies the checksums of the program it is loading.
Add "and can be restarted on a block boundary in case of a checksum error", which was another requirement of the task.
>Somewhere I read the story of how it was created; a really bright hacker
>was tricked into creating it when his colleagues kept saying that it
>couldn't be done.  Anyone have the details?
>	-Joe
No trickery. Just the challenge of doing it.

I think I posted this some years ago. If anyone has an old copy they can compare my current fading memory to the old version. Here's how I remember it now.

This loader was written in an all-out brainstorming effort. It happened at DEC, Maynard, building 5 rather than at TMRC or Project MAC where other such fests happened.

Somehow the challenge came up of writing a paper tape loader that would not require the use of any fixed memory locations. The idea was that any program might be loaded in pieces, and you wouldn't want to clobber any previous part with storage/code used by the loader. Also, to take dumps of a dead program, we didn't want to clobber any core. This should fit entirely in the ACs.

A few previous attempts had been done, but they all took somewhat more than sixteen words. Finally, a bunch of serious bit bummers decided to work on it and get it solved.

My memory may be wrong, but I think the group that worked on it included Alan Kotok, Tom Eggers, Dave Gross and myself, and a couple more whom I'm not so sure of. Maybe Peter Hurley? Maybe Tom Hastings?

Anyway, we came up with a LOT of ways of doing it in fifteen instructions plus the two registers to hold the checksum and the AOBJN pointer. Seventeen words in all. We considered using location 40 for the 17th word, but that didn't feel fair.

Then, out of the blue, Dave Gross came in with this wonderful hack, using two indexed XCT instructions, which was totally unlike anything else we had tried. It used thirteen instructions plus the registers, so it could fit in the ACs WITH the count word needed by the RIM hardware! Register zero was actually a spare. Most other attempts had used it for the checksum.

There was great rejoicing, and the quiet, reserved Dave Gross actually looked quite pleased with himself.

Bob Clements, K1BC, my-last-name at BBN dot COM. (w) +1 617 USE K1BC

Disclaimer

From: JCGreen@ix.netcom.com (John C Green Jr)

By the time the 1971 edition of "PDP-10 Reference Handbook" was published there had been so many questions asked by people using it as an example of good programming technique that this comment was added in the margin:

This loader is written for minimum size and is quite complex. Do not approach it as a simple programming example.

Note from Dave

From: "Dave Gross HLO2-2/B10, pole G13, dtn 225-4317 31-Mar-1998 1403 -0500"
Subject: RE: History of Rim10b loader
Date: Tue, 31 Mar 98 14:03:56 EST

I am the Dave Gross mentioned by Bob Clements in the History of the rim10b loader page. I don't remember Peter Hurley or Tom Hastings working on the code, but there was someone else. I'm not sure who...maybe Peter Conklin or Dave Stone ... who motivated the effort.

The problem was presented to me as a theoretical one: is it possible for a paper tape loader to fit in the register space, load data, compute and check checksums, and jump to the loaded program when done? The others came close as Bob pointed out. My challenge was to "bum" one more location out of the loader. I don't remember how I found that SOJA hack, but when the dust settled, the loader was 2 instructions shorter. Indeed, I nearly broke my arm patting myself on the back.

Then, to my surprise, we actually made use of the loader for most paper tapes. The loader was written up in the programming manuals but very few programmers could figure it out. I kept getting phone calls about that loader for years afterward - right up to the time the 10/20 line was retired.

Dave

Effective address calculation

Later versions of the "Processor Reference Manual" had this paragraph repeated twice:
PLEASE READ THIS
The calculation of E is the first step in the execution of every instruction. No other action taken by any instruction, no matter what it is, can possibly precede that calculation. There is absolutely nothing whatsoever that any instruction can do to any accumulator or memory location that can in any way affect its own effective address calculation.
Note that "A: SOJA A," does not mean "decrement accumulator 10 and then set the PC to the current value of that accumulator". Instead, the effective address E is calculated first, then the accumulator is decremented, then the PC is set to the remembered value of E.
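
The ordering rule can be made concrete with the loader's own SOJA. In the Python sketch below, the instruction word and accumulator A are the same 36-bit value, as they are at location 10. The model assumes no indexing or indirection, so E is simply the right half of the word; `soja_self` is a hypothetical name.

```python
HALF = 0o777777
MASK36 = (1 << 36) - 1


def soja_self(ac_word):
    """Model `A: SOJA A,` where the instruction word itself is accumulator A.

    Returns (new contents of A, new PC).
    """
    e = ac_word & HALF                 # 1. effective address from the address field
    new_ac = (ac_word - 1) & MASK36    # 2. decrement the accumulator (the whole word)
    return new_ac, e                   # 3. jump to the remembered E


word = (0o364400 << 18) | 0o4          # SOJA A,RD+1 as left by HRRI A,RD+1
new_word, pc = soja_self(word)
print(oct(new_word & HALF), oct(pc))   # prints 0o3 0o4
```

The jump goes to the old address (4), while the word left behind now addresses 3, ready for the next pass.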


Maintained by Joe Smith at js-cgi@inwap.com