Rational/R1000s400/Logbook/2021

Fra DDHFwiki
Spring til navigation Spring til søgning

2021-12-31 Final update from 2021

We have started emulating the FIU board too and it is not a total disaster: "TEST_FIU.EM" reports 24 tests passing and 69 tests failing.

Happy New Year!

2021-12-30 From hardware to software

The clock signals are the most important electrical signals in any synchronous logic circuitry, and it is evident from the schematics that a lot of care went into designing good clock circuits. One aspect of this is that the number of chips any particular TTL gate can drive is limited, and therefore duplicated drivers are a very prominent feature of the R1000 schematics:

The 74F37 NAND buffer is designed for driving signals and there 115 of them in the R1000 with 434 of 460 gates are used.

In the hardware computer, such duplicated buffering costs parts, PCB area and power.

In software the duplication costs time, because both of the two NAND gates must be simulated, and for no good reason really, because a single software gate can drive an infinite number of inputs.

Running the "TEST_IOC.EM" through EXPMON in the simulator, causes SystemC to activate the chip classes 26,181,186,645 times and a full 30% of those activations relate to producing and driving the clock signals, because the base clocks run all the time and they run at high frequencies.

So we went through the IOC schematic and de-duplicated all the obvious double buffering, signals named "mumble.B[1-5]" were merged with "mumble.B0" and that reduced it to 23.2 billion activations, a 12% saving.

But there really is no need to produce these clocks with discrete gates, so we created a new component called "CLKGEN" which magically outputs precisely the clocks needed in a R1000 computer, that brought it down to 17.6 billion activations, ⅔ of the original.

A second round over the schematics to perform a "reduction in strength" to move the clock-gating before fan-out in a couple of logic trees brought it further down to 15.3 billion, 58.5% of the original, for the first time making the emulator less than 200 times slower than the hardware.

This is precisely the way we hope to speed up the simulator: Replacing the hardware-way with the software-way, module by module, function by function.

Here are the three clock generation pages, before and after:

In hardware, the IOC board distributes three clock signals, two of which are double buffered, across the backplane to the other cards, which each derive the same 8 "free-running clocks" from them.

In software we will distribute all the clocks from the IOC to the other cards, because we can add more pins to the backplane for free, but each chip we must simulate costs time.

2021-12-29 A day at the races

One of the test-failures we had to solve, involves the three timers available to the microcode on the IOC board, specifically the "TEST_COUNTER_DATA.IOC" experiment failed because two of the three counters (on IOC page 56) did not behave as expected.

The relevant circuitry, slightly rearranged, looks like this:

The signal to load the counter "LOAD_DELAY.S~" is gated by the "CLK.2XE~.B2" clock, as shown on the "timing diagram", and the 74F579 chip reacts to the positive flank of the "Q4~.B1" signal, which is coincident with the rising flank on "CLK.2XE~.B2".

SystemC does a very smart "delta/update" trick, but that does unfortunately save us in this case, because multiple delta/update cycles happen in this moment, and therefore the simulation does not work. Or rather, it might or might not work, depending on the non-deterministic order of things in SystemC's internal data structures: In some cases the F579 sees the clock-flank before the F30 cuts the "LOAD_DELAY.S~" signal, in some cases it happens the other way around.

It does however work in hardware, with plenty of margin, because of this detail in the 74F579 datasheet:

Whatever value we want a 74F579 to use, it must be in place 9.5 nanoseconds before the clock signal goes positive but does not need to be held after the clock flank.

Implementing "Give me the value 9.5 nanoseconds ago" in SystemC can be done, but only at a rather high cost in simulation overhead. It can be done a lot cheaper if one cheats and hardcodes the clock-rate in the F579 class, but for general reasons of sanity, we have decided to make the fix plainly visible:

The output of the "DELAY" components is the same as the input, but delayed an arbitrary 5 nanoseconds, that way there is no longer a race.

2021-12-27 Speeding things up

At the bottom of the previous entry, you will find this line:

   13093.544349 s Wall Clock Time

Testing the IOC board in the emulator took 6½ hour, because the simulation runs more than 200 times slower than the real thing and it had to simulate almost a minute of time.

Fortunately we can do better than that.

Right now the simulator looks like this:

    68K20       C "Musashi"
      |
      |
   DIAGBUS      C
      |
      |
  8051 DIPROC   C & SystemC
      |
      |
     DFSM       SystemC
      |
      |
  IOC BOARD     SystemC

All these parts are driven from the same master crystal in the real computer, and everything runs fast enough that there were no reason to optimize any of this too hard.

But in our case the SystemC stuff runs very slowly, so any and all amount of time spent entirely in the 68K20 or 8051 gets dragged out as well, for no good reason, and both the 68K20 and the 8051 do significant work.

The EXPMON program has an internal script interpreter which is not very performant when the 68K20 CPU only clocks at around 33 kHz.

The 8051, which is not a fast CPU to begin with, each machine cycle being 12 clock cycles long, uses a lot of instructions on "corner-turning" to load of the DUMMY and UIR serial registers.

So the order of the day was "clock decoupling", so that each of the three major parts can run as fast as their simulation code allows.

Of course they cannot just free-wheel, their mutual communication and expectations must keep working, so these three interfaces required attention:

  • The 68K20 cannot send faster on the DIAGBUS than the 8051 can receive.
  • The 68K20 uses the PIT timer to decide of experiments have run too long or DIPROCs have failed to answer.
  • At least one place in the DIPROC firmware, data is fed into the Diagnostic Finite State Machine with no flow-control.

We may have to add more interfaces later, for instance when the R1000 CPU starts interacting with the IOC via the FIFO/SHM circuitry but we will burn that bridge when we get to it.

The 'elastic' buffer used to implement the DIAGBUS, as the name implies, offers flow-control, so the DIPROC simulator can just read from the DIAGBUS as fast as possible, even if that is slower than the 68K20 transmits.

That would have been even easier to implement, if the firmware did not cheat and read the 'SBUF' register twice to calculate a checksum, but it is no biggie:

   /*
    * The DIPROC firmware reads SBUF twice, at both 0x649 and
    * 0x64d, so we cannot use the SCON.RI bit for flow-control.
    * Instead, wait if:
    *      Interrupted and not spinning in 0x646 or 0x6e3
    *      or if SCON.RI is already set
    *      of if SCON.REN is not set
    */

The addresses will obviously have to be tweaked for the DIPROC2 firmware on the MEM32 board.

The second one was even easier: Feed the PIT clock from the SystemC simulation, done.

The last one took some manual work. Obviously any 8051 instruction which does I/O to the DFSM must run at SystemC speed, but a few other instructions must as well, for instance:

   122a f2      MOVX    @R0,A
   122b 41 2d   AJMP    0x122d
   122d e5 a0   MOV     A,P2

It is pretty obvious that the AJMP is only there to put time between the MOVX which starts the DFSM and reading the result it produces.

As far as I can tell, that is an optimization targeting precisely the slow "corner-turning".

With those three interfaces taken care of, the 68K20 and 8051 emulators was let loose, which dropped the wall clock time for the same 'TEST_IOC' run to 55 minutes, doing only 11.5 seconds of simulated IOC board time.

The SystemC still runs at 1/284 real speed, but now we can work on that without needlessly wasting 5½ hour per test run.

2021-12-25 IOC does not fail tests

  CLI> x expmon
  1  32MB MEMORY BOARDS IN PROCESSOR - TOTAL OF 32 MEGABYTES.
  EM> TEST_IOC

         BEGINNING OF IOC BOARD TESTING

  TESTING IOC BOARD TILE 1  -  IOC_DIPROC_TEST

  RESET TEST                                                 PASSED
  TRIGGER SCOPE TEST                                         PASSED
  […]
  TRACE RAM SIMPLE DATA TEST                                 PASSED
  TRACE RAM ADDRESS TEST                                     PASSED
  TRACE RAM CELL TEST                                        PASSED

  END OF IOC BOARD TESTING

  IOC BOARD                                                  PASSED
  EM>

       49.626989500 s simulated
       47.509434200 s SystemC simulation
    13093.544349 s Wall Clock Time
       40.283670 s IOC stopped
    14717443 IOC instructions
    1/263.84 Simulation ratio
    13226.043 s User time
       37.425 s System time
    52188 Max RSS

… and they just launched the James Webb telescope, not a bad X-mas.

2021-12-05 Hacking the MEM32 DIPROC: Habemus Firmware

With a way to execute code from external memory, the remaining task was to write 8051 code to exfiltrate the firmware from the internal ROM.

With the code-protection enabled, the straigtforward use of

   MOVC A,@A+DPTR

does not work, so we need to find that instruction somewhere in the internal ROM and make it do the deed for us.

Instead of actual memory, we are using a tiny ARM processor, which is connected to the 8051's port-0, port-2, CLOCK, ALE and PSEN pins. This allows us to not only give the 8051 the instructions we want, it also allows us to monitor what it does as a result, to the extent something shows up on the external memory interface.

Under the assumption that the MEM32 DIPROC's firmware shares a lot of code with the older DIPROC's we did a scan to locate instructions which give themselves away easily.

If, for instance from the external code we CALL to an address in the internal ROM which happens to contain a 0x22 byte, the 8051 will execute that as a "RET" instruction and return straight back to our code.

(Of course, it would also do that if it hit any trivial instruction, say "CLC C" followed by "RET", but the ARM counts the number of clock- cycles, so we easily tell those apart: Only the fastest returns are direct.)

Similarly, if we hit a 0x02 byte the 8051 will interpret that as a "LJMP" instruction, and if the next byte is larger than or equal to 0x20, the jump is into external memory, and the ARM processor will capture the address.

If we load up the DPTR register with 0x5555, and the A register with 0x55 before the call, any instruction using "@A+DPTR" on code memory, will access location 0x55aa, which, again, the ARM processor detects and reports.

However, for most the addresses, the 8051 goes on to somewhere in internal memory and does not reappear on the external memory interface, so after a timeout the probe fails.

This probing is not fast.

For each location we want to probe, we have to reset the DIPROC and go through the download-experiment-which-corrupts-the-stack gyrations.

To make matters worse, the DIPROC is seriously underclocked to make sure the ARM processor can keep up with the external memory interface.

Combining the necessary with the convenient, we run the DIPROC at 1200 * 64 = 76.8kHz clock-rate, which makes the DIAGBUS run at a standard rate of 1200 bps.

The result of the survey was that the very convenient subroutine at address 0x1393 in the older DIPROC, which we hoped to use, was nowhere to be found in the MEM32 DIPROC's firmware. This does make sense, that routine was used for the short scan-chains, and the MEM32 has none of those.

So time to study the "@A+DPTR" locations we had found.

The "JMP @A+DPTR" instructions were of no use, and most of the "MOVC @A+DPTR" did not return via a "RET(I)" instruction and the ones which did, had mangled the byte they read, before returning.

At this point we would have been stuck if there were not a single 0x93 byte in the internal firmware, because without no "MOVC A,@A+DPTR" instructions, we would have no way to read the internal ROM, fortunately there were plenty of those.

So it was time to roll out the heavy artillery.

Zero the A register, set the address of the location to be read in DPTR, send the first byte of a DIAGBUS "Upload" command, execute a calibrated number of "NOP" instructions and call the "MOVC @A+DPTR" instruction.

With a bit of luck, the 8051 will get a DIAGBUS receive interrupt at just the right moment, where the MOVC instruction has completed the read, but before the next instruction(s) had a chance to mangle it.

The serial interrupt handler very conveniently pushes the A register onto the stack, so some number of clock cycles after the last external memory fetch, we can trust the serial interrupt has happened, and that the 8051 is spinning in the interrupt handler, waiting for the pointer and length bytes to arrive on the DIAGBUS.

At this point the stack contains the return address to our code, the return address for the serial interrupt and the pushed A register, so sending 0x00 and 0x10 for the pointer and length bytes, we get something like this back on the DIAGBUS:

       c0 00 0f cf 0f 06 46 55 45 70 93 06 98 6b 07 55 55 50
                               ----- ----- --
                                 |     |    +--- Byte we want
                                 |     +-------- Interrupted instruction
                                 +-------------- Next address in external memory

Getting that just right took some calibration, and still in 20% of the attempts the interrupt comes too early, in 38% of the attempts it comes too late, but 42% of the time it works.

So now we also have the firmware for the MEM32 DIPROC.

2021-12-03 Hacking the MEM32 DIPROC: First flag captured

It didn't go quite the way I explained in the previous entry: I had forgotten that the 8052 only allows access to Special Function Registers via direct memory access, and the EXP instructions only use indirect memory access.

But I found another way: When a EXP subroutine is called, the return address is pushed on the stack, and the stack lives below the EXP work-area and grows up, so doing a specific number of calls, and then PAUSE will put the stack pointer in the EXP work-area, and then we can over-write it with a download, and make the 8051 return from the serial interrupt routine to an address of our choice.

This experiment does that:

   10 1b                   PC_:    .CODE   EXPERIMENT
   11 00                   R1_:    .DATA   0x0
   12 00                   R2_:    .DATA   0x0
   13 00                   R3_:    .DATA   0x0
   14 00                   R4_:    .DATA   0x0
   15 00                   R5_:    .DATA   0x0
   16 00                   R6_:    .DATA   0x0
   17 00                   R7_:    .DATA   0x0
   18 1b                           .CODE   EXPERIMENT
   19 00                           .DATA   0x0
   1a 0a                           .DATA   0xa
   1b                      EXPERIMENT:
   1b 86  1a                       DEC     0x1a
   1d 51  00  00  19  23           BEQ.W   #0x0000,0x19,0x23
   22 18                           CALL    EXPERIMENT
   23 60                           PAUSE

After that, we send these bytes in the diag_bus:

    0x1a5 ANY DOWNLOAD
    0x002 payload length
    0x031 payload byte #0
    0x073 payload byte #1
    0x0a6 checksum (0x02 + 0x31 + 0x73 = 0x(1)a6)

The 8052 returns out of the mask-ROM:

    IOC.ioc_64.DPROC Instr 0x06aa 32:  |{90 03 b0 1}|RETI|pop(@0x11) -> 0x73|pop(@0x10) -> 0x31|nPC 0x7331/0
    IOC.ioc_64.DPROC OUT OF PROGRAM at 0x7331

0wned!


2021-12-02 Hacking the MEM32 DIPROC for Fun & Profit

I have been pondering ways to defeat the code-protection on the MEM32 DIPROC and I think I have found a way to do it with EXP instructions.

The EXP instructions perform no range-checking, so the 8051 SFR (Special Function Registers) can be modified by downloaded EXP instructions.

The Stack Pointer is a SFR on the 8051, so first part of the attack goes like this is:

1. Move stack pointer into the experiment download area

2. Get the 8051 to push a return address

3. Modify the return address

4. Get the 8051 to return through the modified return address to external code memory

5. Supply external-memory instructions to read out the internal ROM.

The most accesible use of the return is the interrupt handler which receives bytes from the DIAGBUS, because it performs the entire transaction while in the interrupt handler, before executing RETI.

So with a bit more detail the attack looks like this:

Download & execute EXP instructions to:

       store 0x30 into SFR 0x81
       PAUSE

The use of "PAUSE" is important, because it does not reset the stack pointer, like the normal END instruction does:

   016b ; --------------------------------------------------------------------------------------
   016b ; PAUSE       -               |0 1 1 0 0 0 0|m|
   016b ; --------------------------------------------------------------------------------------
   016b                            INS_PAUSE:
   016b 75 04 03     |u   |                MOV     diag_status,#0x03
   016e 21 7d        |!}  |                AJMP    0x17d                   ; Flow J 0x17d
   […]
   0175 ; --------------------------------------------------------------------------------------
   0175 ; END         >R              |0 1 0 1 1 1 0|m|
   0175 ; --------------------------------------------------------------------------------------
   0175                            INS_END:
   0175 75 04 01     |u   |                MOV     diag_status,#0x01
   0178 75 81 06     |u   |                MOV     SP,#0x06
   017b c2 8a        |    |                CLR     TCON.IT1
   017d e5 04        |    |                MOV     A,diag_status
   017f b4 06 fb     |    |                CJNE    A,#0x06,0x17d           ; Flow J cc=NE 0x17d
   0182 a1 1c        |    |                AJMP    EXECUTE                 ; Flow J 0x51c

Then download another set of "EXP instructions" which overwrite the stack, so that instead of returning to the loop at 0x017d…0x017f, the 8051 returns into external memory.

However, we are not done, because the code protection bits also prevent a "MOVC" instruction in external memory from reading the internal code ROM, so we have to find a MOVC instruction in the internal ROM to do the dirty work for us.

The 0xDA "PERMUTE" EXP instruction uses a number of MOVC instructions to read the permutation tables, one of them is in the subroutine starting at 0x13d8 in the normal DIPROC. However, they all have the job of mangling things based on the table they find in code memory, so we need to deduce the "table" they read from CODE memory from the mangling they do.

That part is still "for further study" as the CCITT would say.

2021-11-28 7 down, 7 to go

We are now down to only 7 tests failing on the simulated IOC board:

   DELAY TIMER MACRO EVENT TEST                                         
   SLICE TIMER MACRO EVENT TEST                                         
   EXIT MICRO EVENT TEST                                                
   TRACE RAM SIMPLE DATA TEST                                           
   TRACE RAM ADDRESS TEST                                               
   TRACE RAM CELL TEST                                                  

Rooting out these bugs is slow going: First we need to disassemble the experiment - to the extent that we currently can - then we need to find out what the DFSM (Diagnostic Finite State Machine) steps it uses do, and finally we can try to find out why they do not.

The main cause of trouble continues to be misreadings of the schematics, to the extent that it is always the first working hypothesis that one or more of (0|8, I|T, M|N|H, B|D|S) were confused.

Fixing the ECC was slightly tricky because seven of the nine checkbits are a parity over a "random" selection of the 128 bits in the memory word:

          TYP:0                                                     TYP:63 VAL:0                                                     VAL:63 CHECKBIT
   ECCG16 --------------------------------++++++++++++++++++++++++++++++++ --------------------------------++++++++++++++++++++++++++++++++ +--------
   ECCG17 ++++++++++++++++++++++++++++++++-------------------------------- --------------------------------++++++++++++++++++++++++++++++++ -+-------
   ECCG28 ++++++++-----++---+++-+--+++--+-++++++++-----++---+++-+--+++--+- ++++++++-------+-----+-+--+-+-+----++---+-+-+--++++---++++-+---- --+------
   ECCG29 +++++++++++++-++++---+-------+--+++++++++++++-++++---+-------+-- -+------++++++++-------++----++-----+++++--+-++----+---+--++---+ ---+-----
   ECCG44 ++-----+++++++++--++--+++---+-++++-----+++++++++--++--+++-----++ ---+---+----+++---++++--++++------------++++++++---++++------++- ----+----
   ECCG45 -----++++--+---++++++++++++----+-----++++--+---++++++++++++-+--+ --+--++----+--+-++++++++-----+---++----+--------++++++++----+--+ -----+---
   ECCG46 +-++--+-++--+---++----+-+++++++++-++--+-++--+---++----+-++++++++ ----+-+--++------++---+-+++++++++-++--+--+----++--+-+----++-++-- ------+--
   ECCG61 -++-++----+-++-++-+-++-++-++++---++-++----+-++-++-+-++-++-++++-- +---++-++-+-+--++---+-+--+-----+++---+----+--+---+---+--++++++++ -------+-
   ECCG62 ---++----+++-++--+-+++-+-+-+++++---++----+++-++--+-+++-+-+-+++++ ++++----++-+-+--++-+-------++--+++++++++-+-++---+-------+-----+- --------+

I'm far from an ECC specialist, but I do know the crucial feature of this pattern is that any two columns differ from all the other ones in three places or more, (Hamming-distance >= 3), that way there will be exactly one legal "syndrome" which is closest to any of the illegal ones produced by a single bit error.

I wrote a bit of python code which finds all the "ECCG" chips in the SystemC code, extracts the connections between them and calculates the above map, and then finally calculate the minimum hamming distance and complain if it is less than three. The list of which signals went into the codes which had hamming distance of only two would be the main culprits and as usual it ended up being a 0|8 confusion.

Since the 74F280 chip only has 9 inputs, it tales a tree of them to produce the above result:

          TYP:0                                                     TYP:63 VAL:0                                                     VAL:63 CHECKBIT
   ECCG00 ++++++++-------------------------------------------------------- ---------------------------------------------------------------- --------- 
   ECCG01 --------++++++++------------------------------------------------ ---------------------------------------------------------------- --------- 
   ECCG04 ----------------++++++++---------------------------------------- ---------------------------------------------------------------- --------- 
   ECCG05 ------------------------++++++++-------------------------------- ---------------------------------------------------------------- --------- 
   ECCG08 --------------------------------++++++++------------------------ ---------------------------------------------------------------- --------- 
   ECCG09 ----------------------------------------++++++++---------------- ---------------------------------------------------------------- --------- 
   ECCG12 ------------------------------------------------++++++++-------- ---------------------------------------------------------------- --------- 
   ECCG13 --------------------------------------------------------++++++++ ---------------------------------------------------------------- --------- 
   ECCG02 ---------------------------------------------------------------- ++++++++-------------------------------------------------------- --------- 
   ECCG03 ---------------------------------------------------------------- --------++++++++------------------------------------------------ --------- 
   ECCG06 ---------------------------------------------------------------- ----------------++++++++---------------------------------------- --------- 
   ECCG07 ---------------------------------------------------------------- ------------------------++++++++-------------------------------- --------- 
   ECCG10 ---------------------------------------------------------------- --------------------------------++++++++------------------------ --------- 
   ECCG11 ---------------------------------------------------------------- ----------------------------------------++++++++---------------- --------- 
   ECCG14 ---------------------------------------------------------------- ------------------------------------------------++++++++-------- --------- 
   ECCG15 ---------------------------------------------------------------- --------------------------------------------------------++++++++ --------- 

These sixteen chips calculate a per-byte parity, and similar circuits are present on TYP, VAL and FIU boards, to protect the transfer across the frontplane from one board to another.

Then another 38 chips complete the pattern:

          TYP:0                                                     TYP:63 VAL:0                                                     VAL:63 CHECKBIT
   ECCG18 ---------------------------------------------------------------- ---------------+-----+-+--+-+-+----++--------------------------- --------- 
   ECCG19 ---------------------------------------------------------------- -+---------------------++----++-----+++------------------------- --------- 
   ECCG20 ---------------------------------------------------------------- ----------------------------------------+-+-+--++++---+--------- --------- 
   ECCG21 ---------------------------------------------------------------- ---------------------------------------++--+-++----+---+--+----- --------- 
   ECCG22 -------------++---++-------------------------------------------- -------------------------------------------------------+++-+---- --------- 
   ECCG23 --------+++++-+------------------------------------------------- -----------------------------------------------------------+---+ --------- 
   ECCG24 --------------------+-+--+++--+--------------++----------------- ---------------------------------------------------------------- --------- 
   ECCG25 ---------------+++---+-------+----------+++--------------------- ---------------------------------------------------------------- --------- 
   ECCG26 --------------------------------------------------+++-+--+++--+- ---------------------------------------------------------------- --------- 
   ECCG27 -------------------------------------------++-++++---+-------+-- ---------------------------------------------------------------- --------- 
   ECCG41 ---------------------------------------+----------++--+++-----++ ---------------------------------------------------------------- --------- 
   ECCG42 ----------------------------------------+--+---+--------+++-+--+ ---------------------------------------------------------------- --------- 
   ECCG30 ---------------------------------------------------------------- ---+---+----+++---+++------------------------------------------- --------- 
   ECCG31 ---------------------------------------------------------------- ----+-+--++------++---+---------+------------------------------- --------- 
   ECCG32 ---------------------------------------------------------------- ---------------------+--++++-----------------------+++---------- --------- 
   ECCG33 ---------------------------------------------------------------- --+--++----+--+--------------+---++----------------------------- --------- 
   ECCG34 ---------------------------------------------------------------- ----------------------------------++--+--+----++--+-+----------- --------- 
   ECCG35 ++-----+----------++-------------------------------------------- ------------------------------------------------------+------++- --------- 
   ECCG36 -----++++--+---------------------------------------------------- ---------------------------------------+--------------------+--+ --------- 
   ECCG37 +-++--+--------------------------------------------------------- ---------------------------------------------------------++-++-- --------- 
   ECCG38 ----------------------+++---+-++++------------------------------ ---------------------------------------------------------------- --------- 
   ECCG39 ---------------+--------+++----+-----+++------------------------ ---------------------------------------------------------------- --------- 
   ECCG40 --------++--+---++----+---------+-+----------------------------- ---------------------------------------------------------------- --------- 
   ECCG43 -----------------------------------+--+-++--+---++----+--------- ---------------------------------------------------------------- --------- 
   ECCG47 ---------------------------------------------------------------- +---++-++-+-+--+------------------------------------------------ --------- 
   ECCG48 ---------------------------------------------------------------- ----------------+---+-+--+-----+++---+-------------------------- --------- 
   ECCG49 ---------------------------------------------------------------- ++++----++-+-+-------------------------------------------------- --------- 
   ECCG50 ---------------------------------------------------------------- ----------------++-+-------++--+---------+-+-------------------- --------- 
   ECCG51 -++-++---------------------------------------------------------- ------------------------------------------+--+---+---+---------- --------- 
   ECCG52 ---++----++----------------------------------------------------- --------------------------------------------+---+-------+-----+- --------- 
   ECCG53 ----------+-++-++-+-++------------------------------------------ ---------------------------------------------------------------- --------- 
   ECCG54 -----------+-++--+-+++-+---------------------------------------- ---------------------------------------------------------------- --------- 
   ECCG55 -----------------------++-++++---++----------------------------- ---------------------------------------------------------------- --------- 
   ECCG56 -------------------------+-+++++---++--------------------------- ---------------------------------------------------------------- --------- 
   ECCG57 ------------------------------------++----+-++-++-+------------- ---------------------------------------------------------------- --------- 
   ECCG58 -----------------------------------------+++-++--+-++----------- ---------------------------------------------------------------- --------- 
   ECCG59 ----------------------------------------------------++-++-++++-- ---------------------------------------------------------------- --------- 
   ECCG60 -----------------------------------------------------+-+-+-+++++ ---------------------------------------------------------------- --------- 

In Hamming's original 1950 article, Detecting and Error Correcting Codes the check-bits were arranged so that their numerical value gave the position of the flipped bit.

But this is not a classical Hamming-syndrome.

I suspect the R1000 designers figured out they could save some chips by creating a "syndrome" which recycled some of the sixteen parity bits, and therefore the syndrome of the check-bits are run through a small PROM chip to convert it to a bit position:

One output bit tells if this an "invalid syndrome", 374 of the 512 syndromes consist of just that bit.

If the "invalid syndrome" bit is zero, the other seven bits identify which one of the 128 bits should be flipped.

That leaves 10 syndromes which have the invalid syndrome bit set, but which also presents something in the other seven bits:

   0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48, 0x7f

Could they be "magic markers" which confirm to diagnostic tests that they got the expected wrong syndrome ?

2021-11-24 IOC page 28 reconstructed

Page 28 is missing in our binder of schematics.

The first thing I did with that binder, almost a decade ago, was to scan it, and page 28 was missing, so it was missing when we got it.

From the titles of the surrounding pages it would appear to be related to memory, somehow:

By cross-referencing the the netlist out of KiCad and the "chipmap" listing the name of each position on the PCB and what was in it, four 74F280 chips were totally unaccounted for, and the KiCad "Electrical Rules Check" complained about nothing feeding the data input of the RAM chips storing the parity, so tonight I went into the lab and "beeped" out the connections of those four chips, and now we have a "hypothesized page 28", with a explanatory note about where the fiction comes from:

2021-11-20 KiCad Schematics online

Redrawing the schematics in KiCad is essentially complete, so I have put the result online in github:

https://github.com/Datamuseum-DK/R1000.HwDoc

RESHA needs to be redrawn, but I think we are one or more sheets shy, and the ones we have look very preliminary, so that is on the "later" list.

IOC lacks page 28, we simply do not have that page. I suspect it was affected by an ECO and never got put back in the binder.

2021-11-09 No luck reading MEM32 DIPROC firmware

I tried reading the firmware out of the MEM32 DIPROC tonight, using the [External Address Hack] but no luck: The 8752 chip clams up and executes no instructions at all when the `EA` input is pulled low.

Intel added three successive "lock bits" to their MCS51 microcontrollers in order to be able to keep the microcode from being copied, the third one is the one which simply disables external code execution ability.

As far as I can tell, the only known way to read out the firmware from a chip so protected, is to decap it and hit the security fuses with a laser, while being very careful not to hit the EPROM bits. Given that we have only three chips, one in each of our three working boards, that is totally out of the question.

One of our MEM32 boards have a much lower serial number than the two others, and a slightly different text on the sticker on the DIPROC, so there is a very small chance, that it may be possible to read the firmware out of that one. That board is in remote storage now, so that will have to wait.

Another unlikely avenue could be to exercise the DIPROC while in the board, via the DIAGBUS, using the `*.M32` files in the DFS filesystem, while capturing the activity on all its pins with a logic analyzer, and then reconstruct the firmware from those traces. If the firmware shares enough code with the DIPROC on the other boards, this just might be feasible, given enough time and sufficient motivation, but we are into "stranded on a desert island" territory there.

Being unable to emulate this firmware obviously takes the "Diagnostic Archipelago" out of the picture as far as the MEM32 board goes, but I do not think that is a show-stopper for us.

Compared to the other boards, the MEM32 board is quite simple, and has no microcode of its own, but gets commanded from part of the FIU boards microcode, via 20 signals on the backplane.

But obviously a complication we could do without.

2021-11-02 Almost 50/50 isn't bad

Almost 20 years ago, at BSDCON2002, John R. Mashey, already then a legend, gave us young geeks a reprise of a already then 20 year old talk, Software Army on the March, and ever since I have wished that tank was captured on video, because it has some of the best wisdom I have ever gotten about software development, in all its forms and shapes.

One particular kind of software development, is where you know where you want to go, but not how to, or even if it is possible to get there.

Frederick P. Brooks tells us to always build a prototype and throw it away, and John R. Mashey took that advise one step further by framing the prototype in terms of the scouts a moving army sends out, to find viable routes etc: You dont send scouts backwards or along routes you think will work, you send them out where things may not work and where you simply have no idea of things will work.

He also told us to be extremely sceptical of the promises made by the scouts and prototypes: A bridge which untroubled bore a scout on a motorbike is not by definition a way to move your tanks across the river.

The R1000 Emulation project has a lot of unknowns but we know it is not impossible, emulating computers is old hat, and the R1000 was even simulated to quite some extent before it was built in the first place. But getting from A to B still involves a lot of path-finding and I think we have the first scout back now, with some good and some bad news.

The good news is that KiCad+SystemC looks like a feasible path for us, the bad news is that it will require some very strong tools to debug problems along the way.

Amongst the major and absolutely inescapable items on the shopping list is a way to snapshot and restart the emulator, otherwise we will waste far too much time getting the emulator to run to the interesting point. Somewhat surprising to me, snapshort/restart is not part of SystemC, but I think I have an plan how to go about it.

I think that is a bridge we have to construct now, along with that some more flexible debugging facilities. So while I go of to a side doing that, enjoy this almost 50/50 success-ratio IOC test-run:


   CLI> x expmon
   1  32MB MEMORY BOARDS IN PROCESSOR - TOTAL OF 32 MEGABYTES.
   EM> TEST_IOC
   
          BEGINNING OF IOC BOARD TESTING
   
   TESTING IOC BOARD TILE 1  -  IOC_DIPROC_TEST
   
   RESET.IOC
   RESET TEST                                                  PASSED
   TRIGGER_SCOPE.IOC
   TRIGGER SCOPE TEST                                          PASSED
   
   TESTING IOC BOARD TILE 2  -  IOC_SCAN_CHAIN_TEST
   
   TEST_PAREG_SCAN.IOC
   PARITY REGISTER SCAN TEST                                   PASSED
   TEST_SYNDROME_SCAN.IOC
   SYNDROME REGISTER SCAN TEST                                 PASSED
   TEST_UIR_SCAN.IOC
   MICRO INSTRUCTION REGISTER SCAN TEST                        PASSED
   TEST_RDR_SCAN.IOC
   "DUMMY" READ DATA REGISTER SCAN TEST                        PASSED
   
   TESTING IOC BOARD TILE 3  -  IOC_WCS_TEST
   
   TEST_WCS_DATA.IOC
   SIMPLE WCS DATA TEST                                        PASSED
   TEST_WCS_ADDRESSING.IOC
   WCS ADDRESS TEST                                                            FAILED
   
             FAILING EXPERIMENT IS :  TEST_WCS_ADDRESSING
   
   TEST_WCS_BITS.IOC
   WCS CELL TEST                                               PASSED
   
   TESTING IOC BOARD TILE 4  -  IOC_ENABLE_TEST
   
   TEST_ADDRESS_ENABLES.IOC
   ADDRESS BUS SOURCE DECODER TEST                             PASSED
   TEST_FIU_ENABLES.IOC
   FIU BUS SOURCE DECODER TEST                                 PASSED
   TEST_TV_0_ENABLES.IOC
   TV BUS SOURCE DECODER TEST - NO DUMMY_NEXT OR CSA_HIT                       FAILED
   
             FAILING EXPERIMENT IS :  TEST_TV_0_ENABLES
   
   TEST_TV_1_ENABLES.IOC
   TYPE VAL BUS SOURCE DECODER TEST - WITH CSA_HIT                             FAILED
   
             FAILING EXPERIMENT IS :  TEST_TV_1_ENABLES
   
   TEST_TV_2_ENABLES.IOC
   TYPE VAL BUS SOURCE DECODER TEST - WITH DUMMY_NEXT                          FAILED
   
             FAILING EXPERIMENT IS :  TEST_TV_2_ENABLES
   
   TEST_TV_3_ENABLES.IOC
   TV BUS SOURCE DECODER TEST - WITH DUMMY_NEXT & CSA_HIT                      FAILED
   
             FAILING EXPERIMENT IS :  TEST_TV_3_ENABLES
   
   TESTING IOC BOARD TILE 5  -  IOC_TIMER_TEST
   
   TEST_COUNTER_DATA.IOC
   TIMER DATA TEST                                                             FAILED
   
             FAILING EXPERIMENT IS :  TEST_COUNTER_DATA
   
   TESTING IOC BOARD TILE 6  -  IOC_ECC_TEST
   
   TEST_SYNDROME_TRANSCEIVER.IOC
   SYNDROME TRANSCEIVER (ECC) TEST                             PASSED
   TEST_CHECKBITS.IOC
   CHECKBIT (ECC) TEST                                                         FAILED
   
             FAILING EXPERIMENT IS :  TEST_CHECKBITS
   
   TEST_MULTIBIT_ERROR.IOC
   MULTIBIT ERROR (ECC) TEST                                   PASSED
   TEST_ECC.IOC
   FULL ECC TEST                                                               FAILED
   
             FAILING EXPERIMENT IS :  TEST_ECC
  
   TESTING IOC BOARD TILE 7  -  IOC_EVENTS_TEST
   
   TEST_MACRO_EVENT_DELAY.IOC
   DELAY TIMER MACRO EVENT TEST                                                FAILED
   
             FAILING EXPERIMENT IS :  TEST_MACRO_EVENT_DELAY
   
   TEST_MACRO_EVENT_SLICE.IOC
   SLICE TIMER MACRO EVENT TEST                                                FAILED
   
             FAILING EXPERIMENT IS :  TEST_MACRO_EVENT_SLICE
   
   TEST_MICRO_EVENT_ECC.IOC
   TEST_MICRO_EVENT_EXIT.IOC
   EXIT MICRO EVENT TEST                                                       FAILED
   
             FAILING EXPERIMENT IS :  TEST_MICRO_EVENT_EXIT
   
   TEST_IOC_CLOCKSTOP.IOC
   IOC CLOCKSTOP EVENT TEST                                    PASSED
   
   TESTING IOC BOARD TILE 8  -  IOC_TRACE_TEST
   
   TEST_TRACE_DATA.IOC
   TRACE RAM SIMPLE DATA TEST                                                  FAILED
   
             FAILING EXPERIMENT IS :  TEST_TRACE_DATA
   
   TEST_TRACE_ADDRESSING.IOC
   TRACE RAM ADDRESS TEST                                                      FAILED
   
             FAILING EXPERIMENT IS :  TEST_TRACE_ADDRESSING
   
   TEST_TRACE_BITS.IOC
   TRACE RAM CELL TEST                                                         FAILED
   
             FAILING EXPERIMENT IS :  TEST_TRACE_BITS
   
   END OF IOC BOARD TESTING
   
   IOC BOARD                                                                   FAILED
   
   14   TEST(S)   FAILED
   EM>

2021-10-26 Getting somewhere

I have found a set of TEST*.EM files, which seem to be a collection of per-board test and I have started with the IOC board, since the simulator runs fastest with only one board, and the IOC produces the clock signals.

Status so far:

   CLI> x expmon
   1  32MB MEMORY BOARDS IN PROCESSOR - TOTAL OF 32 MEGABYTES.
   EM> TEST_IOC

          BEGINNING OF IOC BOARD TESTING

   TESTING IOC BOARD TILE 1  -  IOC_DIPROC_TEST

   RESET TEST                                                 PASSED
   TRIGGER SCOPE TEST                                         PASSED

   TESTING IOC BOARD TILE 2  -  IOC_SCAN_CHAIN_TEST

   PARITY REGISTER SCAN TEST                                  PASSED
   SYNDROME REGISTER SCAN TEST                                PASSED
   MICRO INSTRUCTION REGISTER SCAN TEST                       PASSED
   "DUMMY" READ DATA REGISTER SCAN TEST                       PASSED

   TESTING IOC BOARD TILE 3  -  IOC_WCS_TEST

   SIMPLE WCS DATA TEST                                                 FAILED

             FAILING EXPERIMENT IS :  TEST_WCS_DATA

The bugs I have fixed so far have all been variants of typos, either in the schematics or in the component models.

It is my general policy to run development versions of software where I can, and the development version of KiCad seems to be somewhat in transit with respect to Electrical Rule Checks, at least I think it should have spotted one of my typos, but it did not.

It is probably about time to start adding some teeth to the code which converts the KiCad netlist to SystemC, so it spots trouble when all boards are "plugged in", since KiCad only sees one board at a time. This will also automatically plug some of the holes in KiCad's per-board ERC.

My F245AB component model did not react to changes on pin18, because I accidentally had left that pin out of the "sensitive" clause.

2021-10-25 First test passed

It may not do much, but this is officially the first passed test for the simulator:

   CLI> x expmon
   1  32MB MEMORY BOARDS IN PROCESSOR - TOTAL OF 32 MEGABYTES. 
   EM> FIU_RESET_TEST
   
   TESTING FIU BOARD TILE 1  -  FIU_RESET_TEST 
   
   SIMPLE_DIAGNOSTIC_COMMAND                                  PASSED 
   EM> 

The simulator can be told which boards to "plug in", and in this case only the IOC (which generates the clocks) and the FIU board were active, so the simulator ran quite fast:

         2.903304450 s simulated
         0.722312450 s SystemC simulation
       272.648054 s Wall Clock Time
         0.424236 s IOC stopped
     4236823 IOC instructions
     1/93.91 Simulation ratio
       272.853 s User time
         2.134 s System time
     61428 Max RSS

Now we need to create a "Continuous Integration" facility, to run all the "*.EX" files on the DFS filesystem and record their outcomes.

2021-10-20 Bug found in TYP schematics

One of the really big gambles about basing the software emulation on the schematics, is the uncertainty if the schematics we have are truthful.

The RESHA schematics look unfinished, but fortunately they are rather peripheral to the project and there are relatively few components on that board.

Until now the other schematics seems to have held up well, but today I spotted the first inconsistency on one of the TYP schematics: Two chips both named DIXCV1.

Investigation of the S400 board images, convince me that the topmost one is DIVXCV0, and I see no realistic way the schematic could have contained that, and subsequent printing/copying/scanning etc. turned that into …1 as seen in the image.

This is not a problem, in the sense that it is not information which affects the netlist produced, but it does tell us to be (more) critical.

It has been the plan all along to digitize as many data sources as possible, in order to cross-reference all the information we have, in an attempt to detect discrepancies between the schematics and reality, for instance by typing in the chip names from the solder side silk screen, in our ChipMaps page, and later the chip types from the component side as well.

Hopefully that "census" matches the completed schematics, otherwise we have a LOT more work to do.

2021-10-17 Hole through the diagbus

We have hole through the simulated diagbus now:

   CLI> x novram

   Options are:
       0 => Exit.
       1 => Display novram contents.
       2 => Modify  novram contents.
       3 => Change TCP/IP board serial number.
   Enter option : 1
           Part   Serial  Artwork    ECO     Date of
   Board  Number  Number  Revision  Level  Manufacture
   IOC     49      10295    3        13     10-JUL-92
   VAL     0       0        0        0      ??-???-??
   TYP     0       0        0        0      ??-???-??
   SEQ     45      10571    2        5      12-JAN-93
   FIU     0       0        0        0      ??-???-??
   MEM0    0       0        0        0      ??-???-??
   RESHA   41      10272    3        13     24-JUN-92
   TCP/IP (CMC) board serial number is 1671

The values from the SEQ board comes from the SystemC model of the Xicor X22C12 NOVRAM chip via the simulated i8052 DIPROC across the simulated diagbus to the simulated M68K20.

Getting it work involved hacking som random usleep(3) calls into the M68K20 side of the diagbus, so it would not time the DIPROC out before it got a chance to wiggle the READ_NOVRAM_DATA.SEQ experiment through the SystemC simulation.

The easiest way, and in many ways the correct way, to handle this would be let the SystemC 20MHz clock drive the simulated IOC.

But that would needlessly slow down the IOC simulation, which is useful on its own for various research purposes.

It looks like the DIPROC timeout test is driven by the PIT timer, so maybe tying only that to the SystemC clock, when that is running, will be enough ?

Entering the schematics in KiCad is relatively painless, it does get a bit slow near completion of a board, but I get through it. It looks like the completed schematics will generate a tad over 100KLOC of C++/SystemC code.

2021-10-09 Time is an illusion

I concluded the previous entry by noting that the NOVRAM saw an all-ones address, that is now solved, and it transpired to be quite interesting.

The R1000 is, by and large, a synchronous logic system, there is a centralized clock of 20 MHz and things happen on its beat.

Since things often have to happen sequentially, it is very common to divide the main clock down and provide for clock-phases.

The R1000 has four phases, named Q1 to Q4, each of which is 50 nanoseconds long, and these signals are produced locally on each board, if the board needs them. Q2 and Q4 are the popular ones, al boards seem to have them, while Q1 and Q3 are less used.

In practice the signals are produced by AND'ing a version of the "2X" and a version of the "1X" clock signals, these being 10MHz and 5 MHz respectively, divided down from the 20MHz, the waveforms looks something like this:

The 2X and the H1 signals are fed to a NAND gate, which produces a Q3~ low pulse, whenever they both are high.

The problem arises because the two inputs trace places from one being low and the other high to the opposite.

In real life that happens simultaneously, the R1000 clock circuits are designed very carefully to make sure of that.

But in the SystemC simulation these two events are simulated sequentially, and depending on the order, that can create a zero-width glitch, which will trigger downstream edge-triggered circuits.

Here is the debugging output where I first spotted it:

       @4721125 ns     sc_module FIU.fiu_63.U6317_F37 11|0
       @4721175 ns     sc_module FIU.fiu_63.U6317_F37 10|1
       @4721175 ns     sc_module FIU.fiu_65.U6516_F74  CLK 1111|10
       @4721225 ns     sc_module FIU.fiu_63.U6317_F37 11|0
       @4721225 ns     sc_module FIU.fiu_63.U6317_F37 01|1                     <-----
       @4721225 ns     sc_module FIU.fiu_65.U6516_F74  CLK 1011|01
       @4721275 ns     sc_module FIU.fiu_63.U6317_F37 00|1
       @4721325 ns     sc_module FIU.fiu_63.U6317_F37 01|1
       @4721325 ns     sc_module FIU.fiu_63.U6317_F37 11|0
       @4721375 ns     sc_module FIU.fiu_63.U6317_F37 10|1
       @4721375 ns     sc_module FIU.fiu_65.U6516_F74  CLK 1011|01
       @4721425 ns     sc_module FIU.fiu_63.U6317_F37 11|0
       @4721425 ns     sc_module FIU.fiu_63.U6317_F37 01|1                     <-----
       @4721425 ns     sc_module FIU.fiu_65.U6516_F74  CLK 1011|01
       @4721475 ns     sc_module FIU.fiu_63.U6317_F37 00|1
       @4721525 ns     sc_module FIU.fiu_63.U6317_F37 01|1
       @4721525 ns     sc_module FIU.fiu_63.U6317_F37 11|0
       @4721575 ns     sc_module FIU.fiu_63.U6317_F37 10|1
       @4721575 ns     sc_module FIU.fiu_65.U6516_F74  CLK 1011|01
       @4721625 ns     sc_module FIU.fiu_63.U6317_F37 11|0
       @4721625 ns     sc_module FIU.fiu_63.U6317_F37 01|1                     <-----
       @4721625 ns     sc_module FIU.fiu_65.U6516_F74  CLK 1011|01
       @4721675 ns     sc_module FIU.fiu_63.U6317_F37 00|1

Notice how at ....25 nanoseconds, the F37 output goes low and high with the same timestamp, and how that triggers the F74.

If SystemC had happened to update the two inputs in the other order, everything would be fine.

Interestingly, and certainly worth knowing, this glitch does not appear in the "wave-file" output:

Notice how the "NEW_CMD_B_inv" signal (output of F74) goes low despite there being no positive edge on the "fiu_globals_Q4_inv..." signal.

After thinking it over on a bike-ride, I decided that the cleanest and most efficient fix was to generate the four phase signals on the IOC board, where the 20MHz clock was available to synchronize them, and distribute them to the other boards via four extra pins on the backplane:

Having done that, the NOVRAM chip sees far more interesting addresses:

       @4191275 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 1 store_1 recall_ 1 we_ 1 a 00000010 dq ZZZZ
       @5065675 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 0 store_1 recall_ 1 we_ 1 a 00000011 dq ZZZZ
       @5070875 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 1 store_1 recall_ 1 we_ 1 a 00000011 dq ZZZZ
       @5737675 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 0 store_1 recall_ 1 we_ 1 a 00000100 dq ZZZZ
       @5742875 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 1 store_1 recall_ 1 we_ 1 a 00000100 dq ZZZZ
       @6617275 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 0 store_1 recall_ 1 we_ 1 a 00000101 dq ZZZZ
       @6622475 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 1 store_1 recall_ 1 we_ 1 a 00000101 dq ZZZZ
       @7289275 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 0 store_1 recall_ 1 we_ 1 a 00000110 dq ZZZZ
       @7294475 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 1 store_1 recall_ 1 we_ 1 a 00000110 dq ZZZZ
       @8168875 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 0 store_1 recall_ 1 we_ 1 a 00000111 dq ZZZZ
       @8174075 ns     sc_module SEQ.seq_83.U8315_NOVRAM  cs_ 1 store_1 recall_ 1 we_ 1 a 00000111 dq ZZZZ

Next: Run the SystemC code in "real-time" and have it send the responses back to the IOC processor via the DIAGBUS.

2021-10-06 The worlds largest Blinky works!

I have tied the SystemC code, including the i8051 emulator into the IOC emulator, and have now seen the Chip-Select pin of the SEQ's NOVRAM go low 31 times.

Let me explain the sequence of events that takes us there:

  • The simulated IOC 68k20 resets into the IOC-EEPROM
  • It completes the selftests (although we skip some of them)
  • It loads the KERNEL.0, FS.0 and PROGRAM.0 from the simulated disk
  • We ask to enter the CLI
  • We execute the CLI command "x novram"
  • This command excutes using the "dummy diproc" code, so all it gets back is zeros.
  • The traffic from the IOC to the DIAGBUS is written to the file "_diag" as we go.
  • When the CLI prompt comes back, we launch the SystemC code with the IOC and SEQ boards, for 27 milliseconds
  • Special-hacky-code reads the "_diag" file and sends those bytes to the DIAGPROCs, essentially replaying the "x novram" programs traffic.
  • The IOC's DIAGPROC has no NOVRAM, but it handles the traffic correctly
  • The SEQ's DIAGPROC downloads the experiment (http://datamuseum.dk/aa/r1k_dfs/32/32b885caa.html READ_NOVRAM_DATA.SEQ)
  • The SEQ's DIAGPROC executes the experiment, wiggling the simulated i8051's pins in the SystemC model.
  • The SystemC model propagates these wigglings through the SystemC model.
  • The wiggling pins trigger the diagnostic finite state machine, where DPROM5 and DUIRG5 produce control signals for the NOVRAM chip.
  • The SEQ's NOVRAM chip sees its Chip Select pin go low/high 31 times.
  • The SEQ's DIAGPROC uploads the result of the experiment.

With 143754 lines of C and C++ source code, this is probably the worlds largest Blinky program :-)

Now the rest is really just debugging and the first question is: Why does the NOVRAM get all-ones address bits ?

2021-10-02 Schematics, i8051 emulator and a snag

I have the i8051 emulator code tied into the SystemC code, and on five of the six "boards", the 8051 correctly figures out which address to use on the serial diagnostic bus:

   DIAGPROC IOC.ioc_63.U6302_8051 Step PC 0x063a MOV       0x03,A  direct 0x03 is 0x04  nPC 0x063c
   DIAGPROC FIU.fiu_65.U6509_8051 Step PC 0x063a MOV       0x03,A  direct 0x03 is 0x03  nPC 0x063c
   DIAGPROC TYP.typ_60.U6007_8051 Step PC 0x063a MOV       0x03,A  direct 0x03 is 0x06  nPC 0x063c
   DIAGPROC VAL.val_63.U6307_8051 Step PC 0x063a MOV       0x03,A  direct 0x03 is 0x07  nPC 0x063c
   DIAGPROC SEQ.seq_82.U8203_8051 Step PC 0x063a MOV       0x03,A  direct 0x03 is 0x02  nPC 0x063c
   DIAGPROC MEM32.mem32_33.U3303_8051 Step PC 0x063a MOV   0x03,A  direct 0x03 is 0x00  nPC 0x063c

But not the MEM32.

Studying the DIPROC firmware I dumped and the MEM32 schematic, there is no way it can get the correct address (0xc for MEM0 and 0xe for MEM2), because the chip which drives the signal is attached to PORT-2 and the firmware reads the address from PORT-1.

My tentative conclusion is that the DIPROC runs a different firmware on MEM32 boards, and there is some convincing evidence to support that:

The chip is a P8752, an EPROM model, where the other boards have a P8052AH mask-ROM model.

The checksum listed on the paper-label on the chip "0184" matches the entry in the BOM, but not the firmware I have dumped.

If only DIPROC chip had been socketed, this would have been a trivial matter, but it is soldered in, on all the three MEM32 boards we have.

2021-09-12 Schematics → KiCad → SystemC

I have spent some time in trains recently, and used it to redraw 25-ish pages of schematics in KiCad.

Each board has it's own KiCad project, from which a netlist is exported.

These netlists are chewed on by a python3 program which produces relatively well-structured SystemC source code.

And now the SystemC source code also compiles.

Next it needs to be hooked up with my own i8051 emulator and the IOC-processor emulation we already have running.

/phk

2021-09-03 Power Supply Replacement

Peter made a new back-plate for our replacement power-supplies:

We reuse the rails from the original power supply so it slides right into place:

I have not attached the control-cable through which the R1000 can instruct the PSU to raise or lower the voltage for diagnostic purposes, in fact I am not sure I have the nerves to experiment with that at all.

The current through the two orange garden-hoses is 155-165 Ampere, and people are apt to exclaim "but it is only 5 volts, what's the problem ?"

The problem is resistance, and since I have the numbers from my check at hand, let's go through the math:

On the two output terminals of the PSU, the voltage is 5.1146 Volt.

On the two huge spade lugs it is already down to 5.1111 Volts.

The difference is only 3.5 milliVolts, 0.0035 Volt, but at 160 Ampere that means that the contact surfaces between the PSU's terminals and the spade lugs get heated by 0.56 Watt.

At the copper conductors crimped into the spade lugs, the voltage is 5.1034 Volts, a loss of 7.7 milliVolts and 1.23 Watts of heating.

At the copper in the other end of the cables we are down to 5.0439 Volts, a whopping 59.5 milliVolts lost and 9.52W at heating.

Yes, the two cables do indeed warm noticeably when the machine runs.

From the copper in the cable to the terminals on the backplane we loose another 8.5 milliVolts, another 1.36 Watt of heating.

In total the loss from the PSU to the backplane is 79.2 milliVolts, and 12.67Watt.

Using Ohms law we find that this is a resistance of 0.0792 Volt / 160 Ampere = 0.000495 Ohm.

And since we are doing this:

The load resistance of the computer, as seen from the PSU is: 5.1146 Volt / 160 Ampere = 0.032 Ohm.

Or as normal people know it: A short-circuit.

I suspect that reason the original power-supply is ailing, (See more about this in our logbook) is that somebody at some time in its history did not clean the output terminals before clamping into a machine, and the increased resistance meant it had to deliver a higher voltage to satisfy the feedback circuit on the RESHA board.

It is absolutely trivial to end up with a contact resistance of several milliohms or even ten, if great care is not taken at these amperages, and for each milliohm extra resistance, the PSU must deliver 160 Ampere * 0.001 Ohm / 5 V = 3.2% higher output voltage.

By not connecting the control-cable to the power-supply, there is no risk of this happening, the power-supply outputs the voltage I have trimmed it to, 5.1146 Volts, and if there is a bad connection, the computer will see a lower voltage and act flaky, but the magic smoke stays put.

2021-08-24 Finally getting somewhere

We have started to really investigate the R1000 disk contents now, and as we do so, it have become clear that we would have been severely time-constrained on our regular thursday evenings: Some commands simply takes hours to run, in some cases many hours.

For instance the lib.space() command does pretty much what the unix du(1) command does, but it takes 3+ hours if you run it from the top of the filesystem.

Having the machine here in my own lab, means I can turn it on, start such a command and go about my regular business while it chews its way through the disk images.

We aslso have a network gateway now, a small FreeBSD VM where all TCP options have been disabled, so the CMC network processor will talk to it. Accounts will be handed out generously, mail me your ssh-key.

We need to figure out a solution for terminal emulation, but in the mean time FTP is usable to go spelunking in the filesystem.

Hopefully the information we gathered that way will enable me to finally complete AutoArchaeologist modules for the backup-tape and the raw disk images.

/phk

2021-08-03 Long overdue update

Datamuseum.dk has lost the practically free lease on our huge basement, and have not yet found new digs, so we are busy moving the entire collection into storage.

Or rather, not quite everything, the R1000 has taken up residence in my personal lab for the duration, so that we can continue our activites with this wonderful piece of kit.

Predictably the trouble-prone power supply threw a tantrum as first point of business, and I have therefore relieved it of duty.

We received a couple of the really huge 3kW supplies from PAM, and after a careful check & renovation in Peter's lab, they will probably be ready for service.

In the mean-time the R1000 will be powered by our fall-back solution: A "Cosel PBA1000F-5" which delivers 5V/200 Ampere, more than plenty for the 150-160A which the R1000 draws, and a "Cosel PBW50F-12-N" for the ±12V, which only drives the fans and the RS-232 port level converters, because we use the "SCSI-SD" rather than the original 5¼" full-height "Wren" disk-drives.

R1000 in PHK's Lab

Next up is configuring a firewall/jump-host so the rest of the team can access the machine across the net on whatever evening we can agree on.

The R1000 will only be turned on when needed, even with the new much more efficient power-supplies, the R1000 draws a full kilowatt, which my A/C has to get rid of again, running it 24*365 would cost approx €3000 in electricity.

/phk

2021-06-29 Ten outstanding PROMs read

A scheduling conflict made it possible for me to visit the collection this afternoon, where I unsoldered the ten bipolar PROMS from the spare /200 board, read them, and remounted them in sockets.

Checksums match the BOM document, but one of the FIU proms have a surprising content, more about that when I have investigated it further.

Also scanned the schematics again, this time in 600dpi grayscale. It is clearly an improvement, but I have a hard time getting used to 2.5GB pdf files. Will have to look into ways to reduce the size.

Still working on brute-forcing the MEM32 GAL chips.

/phk

2021-06-15 PSU trouble diagnosed

The trouble mentioned on 2021-05-25 have been diagnosed: The -12V supply is sick.

Pulling the -12V fuse and using a bench-supply instead, the machine works fine, and draws 1.4A from the -12V rail, most of it from the fans on that rail.

2021-06-10 Reading programmable chips

I had a couple of chances to visit the collection after work, and have read as many of the programmable chips as could.

They go under Bits:Keyword/RATIONAL_1000/SW in the bit archive.

Thanks to the Schematic Bill Of Materials document, I am able to verify the checksums my reads, except where revisions differ between the BOM and what I find on the PCBs.

The following chips are outstanding:

  • FIU PA020-01 - Not socketed
  • FIU PA021-01 - Not socketed
  • FIU PA022-01 - Not socketed
  • FIU PA023-01 - Not socketed
  • FIU PA024-01 - Not socketed
  • SEQ PA040-02 - Not socketed
  • SEQ PA041-01 - Not socketed
  • SEQ PA042-01 - Not socketed
  • SEQ PA043-01 - Not socketed
  • SEQ PA044-01 - Not socketed
  • MEM32 GAL-SETGAL-01 - Checksum mismatch, read=0xaf2d, BOM=0xa744
  • MEM32 GAL-DISTGAL-02 - Bad read, copy protected? BOM has -01
  • MEM32 GAL-MARGAL-02 - Bad read, copy protected? BOM has -01
  • MEM32 GAL-MUXLGAL-02 - Bad read, copy protected? BOM has -01
  • MEM32 GAL-DIBGAL-02 - Bad read, copy protected? BOM has -01
  • MEM32 GAL-TPARGAL-02 - Bad read, copy protected? BOM has -01

2021-06-03 Researching SystemC

I have spent some time researching SystemC as a possible vehicle for simulating the R1000 CPU.

SystemC is "just" a bunch of C++ classes which can express a system emulation at pretty much any level of abstraction, from gate level to IP-component composed SoC device.

There are two huge attractions in SystemC.

The first is that we can use the IOC, the EXPeriments in the DFS filesystem and the "Diagnostic Archipelago" to debug the simulation.

The second is that we can start with a simulation at chip-level, by capturing net-lists from the schematics, and debug that until it works correctly (but likely very slow), and then improve performance by replacing the chip-level modules with higher level abstractions written in C++.

The bad news is that we would need to capture the net-lists to do so, and there are 417 dense pages in the schematic PDF files.

The best way to approach that task, would probably be to create our own symbol library in KiCad, such that the symbols match the Rational symbols used in the schematics, both with respect to pin placement and naming (the symbols in the schematics use the "Bit0 is MSB" convention), so that the result will look as identical as possible to the original schematics.

The KiCad produced net-list, would then have to be post processed into a SystemC "net-list", but that seems to be a mostly trivial data-conversion without too many problems.

The i8052 diag processors, and the IOC's 68k20 processor would need to be tied into the SystemC simulation, but that also seems doable.

While SystemC can be resynthesized into HDL languages (Verilog, VHDL), and while it could be fun to have a R1000 running in a FPGA, I have not research that aspect beyond browsing a document listing all the C++ things you are not allowed to do in that case. At the very least it would require either a HDL or real HW for the 68k20 and the i8052s.

A key question is how fast a chip-level simulation in SystemC actually would run?

If we are talking kHz clock rates, optimization would have to start very early in the process, if we are into MHz, we can postpone that to later.

I have talked to two SystemC specialists about it, and they were not very optimistic about the speed of such a simulation.

The best way forward is probably to make a prototype, for instance of the state machines around one of the i8052 processors, to get an idea how tedious the process would be, and learn something useful about how the diagnostic subsystem works along the way.

/phk

2021-05-27 i8052 Diagnostic Processor mask rom read

I managed to read out the mask rom contents of the i8052 diagnostic processor I borrowed from the /200 VAL card, now available in a bitarchive near you: Bits:30002517.

More about how the readout was done

A cursory disassembly looks like it makes sense.

2021-05-25 R1000 Stille working - sorta

For the first time in half a year I had a chance to enter our collection and of course I tried to boot the R1000.

It worked fine, and I shut it down again.

Then sometime later I remembered I wanted to dump the NVRAM content from the boards, powered it on and got no output on the console.

Thanks to the handy LEDs on the RESHA board, it took me only a moment to realize that the +12V supply was missing.

That prevented the console RS-232 driver from sending usable voltages down the line, perfectly explaining why I got no output.

Hypothesis at this point: We probably violated the PSU's minimum load specification. Reasoning: It is built to supply four 5¼" full-height high performance SCSI drives, but because we use the SCSI-SD emulator, it is only loaded by the two RS-232 ports.

I also read the EEPROMs from the ENP100 processor in PAM's machine, and a handful of 512x8 bipolar PROMS from the i8052/diag circuit on the /200 VAL board. These files are in the bitarchive under Bits:Keyword/RATIONAL_1000/SW

2021-03-14 Microcode spelunking

Now that the IOC-emulator has definitively told us which microcode bits go to which cards, work has (re)started to figure out what they mean.

This work takes place in a separate github repos called [R1kMicrocode] and we can already produce plots such as this one, showing the microinstructions involved in floating point arithmetic:

And listings of the actual microcodes:

   2860:
   seq_cond_sel            0a VAL.ALU_LT_ZERO(late)
   seq_en_micro             0
   seq_latch                1
   fiu_len_fill_lit        74 zero-fill 0x34
   fiu_len_fill_reg_ctl     1 Load Literal     Load Literal
   fiu_load_oreg            1 hold_oreg
   fiu_oreg_src             0 rotator output
   fiu_tivi_src             4 fiu_var
   typ_alu_func            1a PASS_B
   typ_b_adr               16 CSA/VAL_BUS
   typ_c_adr               3b GP 0x4
   typ_c_mux_sel            0 ALU
   val_a_adr               15 ZERO_COUNTER
   val_alu_func            1a PASS_B
   val_b_adr               05 GP 0x5
   ioc_adrbs                2 typ
   ioc_fiubs                1 val
   ioc_tvbs                 2 fiu/val


2021-03-06 IOC emulation usable

After a month, the IOC part of the emulator is now sufficiently usable to allow us to start working on the other parts of the system:

   R1000-400 IOC SELFTEST 1.3.2
       512 KB memory ... [OK]
       Memory parity ... [OK]
       I/O bus control ... [OK]
       I/O bus map parity ... [OK]
       I/O bus transactions ... [OK]
       PIT ... [OK]
       Modem DUART channel ... Warning: DUART crystal out of spec! ... [OK]
       Diagnostic DUART channel ... [OK]
       Clock / Calendar ... Warning: Calendar crystal out of spec! ... [OK]
   Checking for RESHA board
       RESHA EEProm Interface ... [OK]
   Downloading RESHA EEProm 0 - TEST
   Downloading RESHA EEProm 1 - LANCE
   Downloading RESHA EEProm 2 - DISK    - Warning: Detected Checksum Error
   Downloading RESHA EEProm 3 - TAPE    - Warning: Detected Checksum Error
       DIAGNOSTIC MODEM ... DISABLED
       RESHA DISK SCSI sub-tests ... [OK]
       Local interrupts ... [OK]
       Illegal reference protection ... [OK]
       I/O bus parity ... [OK]
       I/O bus spurious interrupts ... [OK]
       Temperature sensors ... [OK]
       IOC diagnostic processor ... [OK]
       Power margining ... [OK]
       Clock margining ... [OK]
   Selftest passed
   
   Restarting R1000-400S March 7th,  20EO at 19:54:42
   
   Logical tape drive 0 is an 8mm cartridge tape drive.
   Logical tape drive 1 is declared non-existent.
   Logical tape drive 2 is declared non-existent.
   Logical tape drive 3 is declared non-existent.
   Booting I/O Processor with Bootstrap version 0.4
   
   Boot from (Tn or Dn)  [D0] :
   Kernel program (0,1,2) [0] :
   File system    (0,1,2) [0] :
   User program   (0,1,2) [0] :
   
   Initializing M400S I/O Processor Kernel 4_2_18
   Disk  0 is ONLINE and WRITE ENABLED
   Disk  1 is ONLINE and WRITE ENABLED
   
   Logical Tape 0, physical drive  0 is declared in the map but is unreachable.

   IOP Kernel is initialized
   Initializing diagnostic file system ... [OK]
   ====================================================

   Restarting system after loss of AC power
   
   CrashSave has created tombstone file R1000_DUMP1.
   READ_NOVRAM_DATA.TYP
   READ_NOVRAM_DATA.VAL
   READ_NOVRAM_DATA.FIU
   READ_NOVRAM_DATA.SEQ
   READ_NOVRAM_DATA.M32
   >>> Automatic Crash Recovery is disabled
   
   >>> NOTICE: the EPROM WRT PROT switch is OFF (at front of RESHA board) <<<
   >>> WARNING: the system clock or power is margined <<<
   CLI/CRASH MENU - options are:
     1 => enter CLI
     2 => make a CRASHDUMP tape
     3 => display CRASH INFO
     4 => Boot DDC configuration
     5 => Boot EEDB configuration
     6 => Boot STANDARD configuration
   Enter option [enter CLI] : 1
   CLI> x novram
   READ_NOVRAM_DATA.TYP
   READ_NOVRAM_DATA.VAL
   READ_NOVRAM_DATA.FIU
   READ_NOVRAM_DATA.SEQ
   READ_NOVRAM_DATA.M32
           
   Options are:
       0 => Exit.
       1 => Display novram contents.
       2 => Modify  novram contents.
       3 => Change TCP/IP board serial number.
   Enter option : 1
           Part   Serial  Artwork    ECO     Date of
   Board  Number  Number  Revision  Level  Manufacture
   IOC     49      10295    3        13     10-JUL-92
   VAL     0       0        0        0      ??-???-??
   TYP     0       0        0        0      ??-???-??
   SEQ     0       0        0        0      ??-???-??
   FIU     0       0        0        0      ??-???-??
   MEM0    0       0        0        0      ??-???-??
   RESHA   41      10272    3        13     24-JUN-92
   TCP/IP (CMC) board serial number is 1671
          
   Options are:
       0 => Exit.
       1 => Display novram contents.
       2 => Modify  novram contents.
       3 => Change TCP/IP board serial number.
   Enter option : 0
   CLI>
   Abort : Other error
   Console interrupt
   From CLI
   
   >>> NOTICE: the EPROM WRT PROT switch is OFF (at front of RESHA board) <<<
   >>> WARNING: the system clock or power is margined <<<
   CLI/CRASH MENU - options are:
     1 => enter CLI
     2 => make a CRASHDUMP tape
     3 => display CRASH INFO
     4 => Boot DDC configuration
     5 => Boot EEDB configuration
     6 => Boot STANDARD configuration
   Enter option [enter CLI] : 6
   --- Booting the R1000 Environment ---
   READ_NOVRAM_DATA.TYP
   READ_NOVRAM_DATA.VAL
   READ_NOVRAM_DATA.FIU
   READ_NOVRAM_DATA.SEQ
   READ_NOVRAM_DATA.M32
   READ_NOVRAM_DATA.SEQ
   READ_NOVRAM_DATA.FIU
   READ_NOVRAM_DATA.TYP
   READ_NOVRAM_DATA.VAL
   READ_NOVRAM_DATA.M32
   LOAD_HRAM_32_0.FIU
   LOAD_HRAM_1.FIU
   ALIGN_CSA.VAL
   ALIGN_CSA.TYP
   LOAD_CONFIG.M32
   CLEAR_TAGSTORE.M32
   CLEAR_PARITY_ERRORS.M32
   CLEAR_PARITY.FIU
   CLEAR_PARITY.VAL
   CLEAR_PARITY.TYP
   CLEAR_PARITY.SEQ
     Loading from file M207_54.M200_UCODE  bound on March 17, 1993 at 2:33:02 PM
     Loading Register Files and Dispatch Rams .... [OK]
     Loading Control Store .............. [OK]
   LOAD_BENIGN_UWORD.TYP
   SET_HIT.M32
   INIT_MRU.FIU
   CLEAR_HITS.M32
   PREP_RUN.TYP
   PREP_RUN.VAL
   PREP_RUN.IOC
   PREP_RUN.SEQ
   PREP_RUN.FIU
   CLEAR_PARITY_ERRORS.M32
   FREEZE_WORLD.FIU
   RUN_NORMAL.TYP
   RUN_NORMAL.VAL
   RUN_CHECK.SEQ
   RUN_CHECK.IOC
   RUN_CHECK.M32
   RUN_NORMAL.FIU

2021-02-07 The foundation of an Emulator

I have started at github project with a R1000 Emulator: [R1000.Emulator]

It uses [Karl Stenerud's "Musashi" 68K emulation] to emulate the IOC processor, and I have built skeleton I/O devices around it, and it gets surprisingly far, for a single weekends work:

    R1000-400 IOC SELFTEST 1.3.2
       512 KB memory ... [OK]
       Memory parity ... [OK]
       I/O bus control ... [OK]
       I/O bus map parity ... [OK]
       I/O bus transactions ... [OK]
       PIT ... [OK]
       Modem DUART channel ... Warning: DUART crystal out of spec! ... [OK]
       Diagnostic DUART channel ... [OK]
       Clock / Calendar ... Warning: Calendar crystal out of spec! ... [OK]
   Checking for RESHA board
       RESHA EEProm Interface ... [OK]
   Downloading RESHA EEProm 0 - TEST
   Downloading RESHA EEProm 1 - LANCE
   Downloading RESHA EEProm 2 - DISK
   Downloading RESHA EEProm 3 - TAPE
       Local interrupts ... [OK]
       Illegal reference protection ... [OK]
       I/O bus parity ... [OK]
       I/O bus spurious interrupts ... [OK]
       Temperature sensors ... [OK]
       IOC diagnostic processor ... [OK]
       Power margining ... [OK]
       Clock margining ... [OK]
   Selftest passed
   
   Restarting R1000-400S February 8th, 2021 at 11:53:44
   
   Logical tape drive 0 is an 8mm cartridge tape drive.
   Logical tape drive 1 is declared non-existent.
   Logical tape drive 2 is declared non-existent.
   Logical tape drive 3 is declared non-existent.
   Booting I/O Processor with Bootstrap version 0.4
   
   Boot from (Tn or Dn)  [D0] : 
   Kernel program (0,1,2) [0] :
   File system    (0,1,2) [0] :
   User program   (0,1,2) [0] :
   Initializing M400S I/O Processor Kernel 4_2_18
   IOP Kernel is initialized
   
   I/O Processor Kernel Crash: error 0806 (hex) at PC=000049F8
   Trapped into debugger.
   RD0 00000000  RD1 00000002  RD2 00000002  RD3 0000000A
   RD4 00000013  RD5 00000006  RD6 0000001E  RD7 00000005
   RA0 0000E800  RA1 0003FADA  RA2 00000954  RA3 0003FAD8
   RA4 0003FFF4  RA5 0003FA96  RA6 0003FB70  ISP 0000FAB8
    PC 0000A158  USP 0003FA96  ISP 0000FAB8  MSP 00000000  SR 2704
   VBR 00000000 ICCR 00000009 ICAR 00000000 XSFC 0 XDFC 0
   @

Some of the self-test routines ar skipped over, to postpone implementation of those I/O facilities until later.

The crash is after the KERNEL has initialized and called FS, which has zero'ed its data-segment and called PROGRAM.

/phk

2021-01-28 Disassembler found

A full text search for instruction strings revealed the machine code for a disassembler:

[⟦2fa0095f7⟧ Disassembler'code]

Comparing the result from this file with the previous guesses confirmed that this is a disassembler.

This has enabled decoding the remaining parts of the instruction set.

2021-01-01 Our first Rosetta Stones

A brute-force text-search at all bit-positions of all heap-segments, we have found our first "Rosetta Stones":

[⟦a564c4d78⟧ Numeric_Primitives'spec]

[⟦c5cdc1bc4⟧ Numeric_Primitives'body]

[⟦cb8e43375⟧ Numeric_Primitives'code]

and

[⟦05ee57b6d⟧ Trig_Lib'spec]

[⟦1329b5ea7⟧ Trig_Lib'body]

[⟦85b414c73⟧ Trig_Lib'code]

Both of them are very simple, and that has enabled us to unravel the basic Float and Discrete mathematical operators, as well as some comparison and flow-control operations.

Happy New Year

/phk