Rational/R1000s400/Logbook

2025-08-06 Tapes are fine, emulator maybe not ?

Trying to restore Environment backup tapes is not going well:

  ====>> Recovery <<====
  Positioning tape to Backup Index
  Processing Backup Index
     Processing Tape File: Vol Info
     Processing Tape File: VP Info
     Processing Tape File: DB Backups
     Processing Tape File: DB Processors
     Processing Tape File: DB Disk Volumes
     Processing Tape File: DB Tape Volumes
  Positioning tape to Backup Data
  Processing Backup Data
     Processing Tape File: Space Info Vol 1
     Processing Tape File: Block Info Vol 1
     Processing Tape File: Block Data Vol 1
     Processing Tape File: Space Info Vol 2
     Processing Tape File: Block Info Vol 2
     Processing Tape File: Block Data Vol 2
  
  *** Unexpected Exception:  <Exception: Unit = 914164, Ord = 14>, from PC = #914008, #128D
       Detected by Routine:  Recover_Save_Set
  *** Recovery was not successful ***

As far as we can tell from tracing, the emulated tape drive does the right thing and this exception happens after the tape reading has completed.

We know that it is possible to restore this backup tape, because we have done so previously, on the real hardware, in order to check that we did in fact have a good backup in our BitArchive.

We created a new backup tape, using the emulator, based on the slash-and-burn "snap08" disk image, and we got the same error.

We have changed a number of variables. For instance, the snap08 image is only 250MB, so we tried to restore it to a single-disk system: still the exact same error.

Since the hardware is stuck on a bad SRAM, next experiment is to launch release2 of the emulator on the "HW" schematics, and see what happens there.

First part of that was to update the schematics to KiCad9, but that was fortunately trivial.

2025-08-04 Now also emulating tapes

A couple of days have been spent getting the emulated SCSI tape to work better, and with the help of the Exabyte-8200 User's manual from bitsavers.org, we seem to have figured out how the SCSI 0x11 "SPACE" command should work.

At least work enough that the emulator is chugging along trying to read in the environment backup tape we created from the system on PAM's disks.
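
For readers unfamiliar with the command, the 6-byte SPACE command descriptor block is small enough to decode in a few lines. This is a minimal Python sketch, not the emulator's code: the function name `decode_space_cdb` is made up for illustration, and the code-field meanings follow the generic SCSI sequential-access definition, so the Exabyte manual remains the authority on which codes that drive accepts.

```python
def decode_space_cdb(cdb: bytes):
    """Decode a 6-byte SCSI SPACE (opcode 0x11) command descriptor block.

    Returns (code, count): code selects what to space over, count is a
    signed 24-bit number where negative values mean "space backwards".
    """
    if len(cdb) != 6 or cdb[0] != 0x11:
        raise ValueError("not a SPACE(6) CDB")
    code = cdb[1] & 0x07          # 0 = blocks, 1 = filemarks (generic SCSI)
    count = int.from_bytes(cdb[2:5], "big")
    if count & 0x800000:          # sign-extend the 24-bit two's complement
        count -= 1 << 24
    return code, count

# Space forward over one filemark, then backwards over two blocks.
print(decode_space_cdb(bytes([0x11, 0x01, 0x00, 0x00, 0x01, 0x00])))
print(decode_space_cdb(bytes([0x11, 0x00, 0xFF, 0xFF, 0xFE, 0x00])))
```

The signed count is the part that is easy to get wrong: a tape positioned after a filemark is often moved backwards with a negative count.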

Our interest in the tape-drive is partly so we can explore the tapes PAM also donated, but primarily because we want to create a single-disk Environment to make the emulator a little bit less intimidating.

To do that we need to run the original disk image, delete enough stuff that it will fit on a single volume, write DFS and Environment backup tapes, and load them into an emulator which only has a single disk.

That is going to take some time.

But the emulator runs 30-40% faster than real hardware on 2% of the power, the emulated SCSI disks and tapes are nearly instantaneous, so not nearly as much time as on the real machine.

Even more importantly, we can write expect-send scripts for the emulator's console terminal, so we do not even have to be present while it chugs along.
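
As an illustration of the expect-send idea, here is a minimal Python sketch. It is an assumption about how such a script could look, not the scripts actually used: the `Console` class is hypothetical, and it only requires an object with `recv()` and `sendall()`, for instance a TCP socket connected to the emulator's console port.

```python
import re

class Console:
    """Tiny expect/send helper around any object with recv() and sendall()."""

    def __init__(self, conn):
        self.conn = conn
        self.buf = b""

    def expect(self, pattern: bytes) -> bytes:
        """Read until the regex matches; return everything up to the match end."""
        rx = re.compile(pattern)
        while True:
            m = rx.search(self.buf)
            if m:
                seen, self.buf = self.buf[:m.end()], self.buf[m.end():]
                return seen
            chunk = self.conn.recv(4096)
            if not chunk:
                raise EOFError("console closed while waiting for %r" % pattern)
            self.buf += chunk

    def send(self, text: bytes) -> None:
        """Send raw bytes, e.g. a command followed by a carriage return."""
        self.conn.sendall(text)
```

A script would then alternate calls such as `console.expect(rb"\*Kernel: ")` and `console.send(b"show_space_info\r")`; the carriage-return line ending is an assumption.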

2025-07-27 Making sense of things using the emulator

Having the emulator means we can now spelunk the R1000 system from the comfort of home, and as time and energy permits, we are starting to make sense of some things now.

All complex systems have their own vocabulary: Normal people "boot" their computers, but IBM "IPL" (=Initial Program Load) them, and I am sure other computer manufacturers have used other words.

When we started spelunking the data structures of the R1000, we had no access to the original Rational vocabulary for the lower levels of the system, and so resorted to sticking likely words on the structures we dug out.

Being able to safely try things out in the emulator provides us with hints and glimpses of the official vocabulary, for instance by poking the kernel CLI interface:

  *Kernel: show_space_info
  VPID [266]: 8
  KIND [MODULE]: INSTRUCTION
  SEGMENT [2460516]: 4289
  SNAPSHOT_NUMBER [63969]: 1
  THE_SPACE        => ( 8,INSTRUCTION, 4289)
  COMMIT_TIME      => 1
  FAMILY           => (UP => 7, PROC => 0, SEQ => 779)
  PERMANENT        => TRUE
  IMPERMANENT      => FALSE
  VER_CONTROL      => COMMIT_BY_REQUEST
  GENERATION       => 2
  COMMITTED        => TRUE
  PAGE_COUNT       => 1
  DELETED          => FALSE
  USER_DATA        => 1: unknown
  OBJECT           => Manager 0Instance 1

(The missing space between '0' and 'Instance' on the last line is a genuine R1000 kernel bug)

Here we learn that what we called a "segment" is actually a "space".

In retrospect, we should have caught a clue sooner, because the backup tape has a file for each disk, named "Space Info Vol X" in the ANSI tape labels.

Poking the kernel CLI further, we learn that they come in at least three fundamental types: "INSTRUCTION", "MODULE" and "IMPORT", and we can even locate the kernel code which produces that output in a code segment in the DFS filesystem.

And as luck would have it, a typing mistake revealed more useful information:

  *Kernel: show_space_info
  VPID [4]: 4
  KIND [IMPORT]: IMPORT
  SEGMENT [256302]: 4450606
  EXPECTED VALUES ARE:  0 ..  4194303

All computer geeks will recognize that number as 2^22-1, so the identity of a space consists of 10 bit VPID, 2 bit KIND and 22 bit SEGMENT.
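
The field widths can be checked with a few lines of Python. This is only a sketch: the log establishes the widths (10 + 2 + 22 = 34 bits), while the field order and the numeric KIND values used in the hypothetical `pack_space_id` below are assumptions made for illustration.

```python
VPID_BITS, KIND_BITS, SEGMENT_BITS = 10, 2, 22

SEGMENT_MAX = (1 << SEGMENT_BITS) - 1   # the limit the kernel CLI printed

def pack_space_id(vpid, kind, segment):
    """Pack (VPID, KIND, SEGMENT) into a single 34-bit integer.

    ASSUMPTION: the field order (VPID most significant) and the numeric
    KIND values are illustrative; the log only establishes the widths.
    """
    for name, value, bits in (("VPID", vpid, VPID_BITS),
                              ("KIND", kind, KIND_BITS),
                              ("SEGMENT", segment, SEGMENT_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} {value} does not fit in {bits} bits")
    return (vpid << (KIND_BITS + SEGMENT_BITS)) | (kind << SEGMENT_BITS) | segment

print(SEGMENT_MAX)
print(hex(pack_space_id(8, 0, 4289)))
try:
    pack_space_id(4, 0, 4450606)        # the mistyped segment number
except ValueError as err:
    print(err)
```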

We also spotted a number of different values in the USER_DATA field, and were able to locate those strings to a kernel code segment in the DFS filesystem, and from that derive the complete list.

Here it transpired that the 0xe3 value which we had assumed was "Ada source code" is actually "Image: Permanent editor buffers", which explains why there are so (relatively) few of them.

But we also run into new mysteries: The "Space Info" files on the backup tape do not contain the actual snapshot number, called COMMIT_TIME by the kernel CLI, but they have a number which somehow translates to the snapshot number, at least for some of the spaces.

2025-05-28 And now for something entirely different

...A R1000 instruction simulator.

The disk image contains a code segment which the AutoArchaeologist names ⟦2fa0095f7⟧.

It is blatantly obvious that this object contains a disassembler for the R1000 instruction set.

When Allan made his amazing reverse engineering of the un(der)documented R1000 instruction set, this was one of the segments he exploited.

It is not a very big segment, and it uses a quite small subset of the R1000 instruction set, so instead of staring through the train window while long stretches of Scandinavia were getting rained on, we tried to cobble together a very crude Python program which could execute those instructions.

It took a lot of guesswork, and a few ugly hacks, but it was possible and took less than 850 lines of python.

We do not trust all aspects of the output, but it did reveal a handful of typos and similar trivial mistakes in our current instruction list.

Between this new source, and the INSTRUCTIONS_SPEC file we found in the EEDB file-system, we now have two very credible sources which agree about almost all instructions, and we have started to revise the disassembler in the PyReveng3 project accordingly.

The discrepancies between the two sources show how the instruction set continued to develop in lock-step with the Ada-compiler's code generator, and we still cannot rule out that the final version, as represented by the disk-images, uses instructions our disassembler still has not learned about.

2025-05-17 Decoding the "pure" Rosetta Stones

Spring is a busy time, and not much is happening with the HW and the emulator.

But we have been spelunking the "archive_on_shutdown" diskimages we created a long time ago, and this has been very productive.

Permanent objects in the R1000 system are controlled by "Manager daemons" and these store their internal state as objects on the disk.

A special facility controlled by Operator.Archive_On_Shutdown(bool) causes the manager daemons to save their state in a serialized form where many items are in string form. (See Guru Course 01 pg. 138)

On next boot, this string form of the state is read and saved into the regular form, which might now be different if a software update was applied.

The serialized state appears as '*.pure' files in the EEDB filesystem, and these are quite a bit easier to figure out than the regular state segments (named '*.state').

At the end of the '*.pure' segments, after the actual state, there is a "schema" which we found out describes the regular '*.state' file layout, and while we have not nailed down all details of either of the formats, we have gotten a lot further.

We have now managed to manually trace our way from a filename, to Directory object, to a File object and from there to the segmented heap which contained the Rational copyright message.

Next task is to make the AutoArchaeologist present all this new information sensibly.

2025-03-11 Press ESC-ESC-O-H for Enclosing-in-place

We tried booting the original disk-images, and repeated the experiment from last week, but without a Facit Twist it is a bit cumbersome, since a modern machine doesn't have nearly as many keys on the keyboard, and in particular not laptops.

So we have created a cheat-sheet and type the necessary escape-sequences by hand, ESC-ESC-O-H for Enclosing-in-place, ESC-[-K for WINDOW and so on.
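
The cheat-sheet notation translates mechanically into the bytes to send. Here is a small Python sketch of that translation; the helper `keys_to_bytes` is hypothetical, and only the two sequences quoted above are taken from the log.

```python
def keys_to_bytes(notation: str) -> bytes:
    """Turn cheat-sheet notation such as "ESC-[-K" into the bytes to send."""
    out = bytearray()
    for token in notation.split("-"):
        # "ESC" is the escape character; everything else is sent literally.
        out += b"\x1b" if token == "ESC" else token.encode("ascii")
    return bytes(out)

CHEAT_SHEET = {
    "Enclosing-in-place": "ESC-ESC-O-H",
    "WINDOW": "ESC-[-K",
}

for name, notation in CHEAT_SHEET.items():
    print(name, keys_to_bytes(notation))
```

Because the dash doubles as the separator, this simple parser cannot express a sequence that contains a literal "-" key.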

The short and the long is that we got the same constraint error, until we noticed that the files we were trying to edit were listed with an "A" which means "Archived".

Once we opened both the 'SPEC and 'BODY for editing, we could promote them both to installed with no problem.

So maybe that Constraint Error is simply all the Environment has to say to somebody who tries to do something utterly crazy ?

We'll find out next time we get the hardware running.

But we have been thinking, and this made us think about it again, that it would be neat to have a way to script scenarios and experiments in the emulator.

There are libraries, for instance libteken which can interpret the escape sequences the r1000 sends and render the screen image correctly. And sending the proper escape sequences is obviously not a problem either.

The hard part is how a "send-expect" style scripting language can work, and if the scripts end up being write-only or if one can actually figure out what they do by reading them.

For trivial text-screen-oriented applications, say the vi(1) or EMACS editors, it is a tractable problem, in the sense that their output is 100% predictable based on the key-presses they receive.

But the r1000 starts snapshots asynchronously, and it chooses which of the three window areas to use, with a predictable but quite complex algorithm based on the contents of the window stack.

It would be neat to have, but probably not to implement.

2025-03-06 So does it actually work ?

The wisdom of »First make it work, then make it faster« is not up for debate.

Yet in this project we had to do the precise opposite, because, as John Aynsley from Doulos warned us, before we even embarked on the SystemC emulation, we saw "kiloHertz not megaHertz".

Yesterday we hooked the Facit Twist in the museum to the emulator back home, through:

A 9600 serial line
A USB-serial adapter
A Raspberry Pi 3
A ssh connection across
…A wired ethernet
…A wireless access point
To a laptop with a flaky wireless connection
Another ssh connection
…Through a TINC VPN
…Across the same flaky wireless connection
…Through more layers of firewalls than reasonable
…100km through Danish ISPs
…To another firewall where the TINC VPN terminated
…Through a few meters of CAT7
To another machine
A Telnet session to the tcp socket on the machine running the emulator.

This was not optimal from a response-time point of view.

Instead we launched the emulator on our "exhibition server", a decade old Intel Sandy Bridge we are going to use to run "the remote end" of our "modem-age" exhibit: BBS, ISP, dial-in service etc.

The server, doing nothing else, managed to run the emulator at a third of hardware speed, which was still usable.

Hoping to eventually speed up booting more, by not wasting time on the nonexistent ENP-100 network co-processor, we tried replacing the entire substantive contents of the !MACHINE.INITIALIZE_DTIA'BODY program with null;, but when we hit Promote we got:

 Promote failed - Unhandled exception: Constraint_Error (Variant), from PC = #46300E, #1435

We tried adding a null; to the one already in !MACHINE.INITIALIZE_MAIL'BODY, and got the exact same result when promoting.

We tried Operator.Enable_Privileges

Same result.

To be honest: We have no idea if that was even supposed to work, but we would at least expect a more "polished" error message, so we assume that something does not work as it should.

"Something" is almost certainly the emulator.

But just in case this is fall-out from the Genghis Khan-style "cleanup" we performed on this disk-image, we will repeat the experiment on the original disk-image.

It would have been nice to just fire up the real machine on the same disk-image and see what should have happened, but we need to catch up with the SRAM epidemic first.

2025-02-25 Well that escalated fast…

We now have a version of the R1000 emulator without SystemC and it runs at 73% of hardware speed, and boots to idle in 37 minutes.

We totally did not expect that two days ago.

The R1000 part of the emulation runs so fast that the simulated IOP often panics. It looks like an interrupt priority/simultaneous interrupts problem, and we're testing fixes for that.

2025-02-23 Ready to ditch SystemC

The last few weeks we have turned the SystemC "signals" into C/C++ variables, and reduced the "sensitivity" of the "components" to just their clock signals.

This has made the emulator much faster: Booting to "idle" in 100 minutes is only five times slower than the real machine, and quite usable.

Better yet: We can get rid of SystemC now: We have a C++ class for each board, and we can just call their "clock" methods in sequence and we should be good.
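
To make the idea concrete, here is a deliberately tiny sketch of such a scheduler-free main loop. It is written in Python rather than the project's C++, the `Board` class is a stand-in, and the order in which the boards are clocked is an assumption.

```python
class Board:
    """Stand-in for one board model; the real ones are C++ classes."""

    def __init__(self, name):
        self.name = name
        self.cycles = 0

    def clock(self):
        # A real board would update its registers and outputs here.
        self.cycles += 1

# Board names as used elsewhere in this log; the call order is an assumption.
BOARDS = [Board(n) for n in ("seq", "fiu", "ioc", "typ", "val", "mem32")]

def run(micro_cycles):
    """Advance every board by one clock per micro cycle, in a fixed order."""
    for _ in range(micro_cycles):
        for board in BOARDS:
            board.clock()

run(1000)
print({b.name: b.cycles for b in BOARDS})
```

The point is what is absent: no event queue, no sensitivity lists and no signal propagation, just plain method calls.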

Until now SystemC emulation of signals and propagation delays has totally dominated performance, but these last couple of weeks the C/C++ code started to matter too.

Once SystemC is gone, it all comes down to the C/C++ code, and making that run five times faster may be a tall order.

2024-12-31 7% greener R1000

The emulator runs at only 2.32% of the hardware speed, but uses only 2.17% as much electricity.

QED: The emulator is 7% more energy efficient than the hardware.
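
The arithmetic behind that claim, as a short Python check. It assumes that "electricity" means power draw, so that work per unit of energy is speed divided by power.

```python
speed_fraction = 0.0232   # emulator speed relative to hardware
power_fraction = 0.0217   # emulator power draw relative to hardware

# Work done per unit of energy, relative to the hardware.
efficiency = speed_fraction / power_fraction
print(f"{(efficiency - 1) * 100:.1f}% more work per unit of energy")
```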

Over the midwinter break, the emulator has been put through a couple of longer tests than the usual "does it boot?" test.

First we let it boot the original disk images from PAM's machine which took 33 hours.

Second we did it again, but this time with writable disk images, and then we shut it down after enabling:

   package Operator is
   […]
   procedure Archive_On_Shutdown (On : Boolean := True);
   function Get_Archive_On_Shutdown return Boolean;
   -- Archive_On_Shutdown causes the next shutdown to store internal
   -- state in 'archive' form, allowing upgrades and conversion of
   -- internal data structures.  It typically takes several hours to
   -- complete a shutdown or restart with archive conversions.

That run lasted 62 hours, and ended with the system shutting itself down as expected.

We have started analyzing those disk-images, to see if we can learn more about the internal state from the 'archive' form of the metadata, and some clues have emerged.

2024-12-01 Going downhill!

If you look at the table above this, you will notice that the software emulation is getting much faster faster.

We are now in "minute for second" territory, where the software emulation takes a minute to simulate a second of machine time; we started out in the "hour for second" wilderness.

Schematics are reduced to essentially a single SystemC component for each board, and we are in the "where is this signal needed and by when in the clock cycle is it needed?" phase, which allows us to optimize out many updates of signals and avoid relatively complex code execution.

One example is the rotator part of the FIU board. Conceptually it is quite simple, but the RUN_UDIAG microcode is written to test how it is implemented, so as long as we want to use RUN_UDIAG, we have to implement it that way, even though it is fairly complex to do so in software.

But we only need to execute that rotator code when something else needs the output of the rotator, and optimizing for that just cut 15 minutes off our 21 hour baseline.

That is a respectable 1.2% speedup, and compound interest being what it is, we only need to find another 206 of those, before we will have speed parity with a 44 year old computer.
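
The "only compute it when somebody needs it" pattern behind the rotator optimization can be sketched as follows. This Python class is an illustration only: the name `LazyRotator`, the 64-bit default width and the plain rotate-left are assumptions, and the real FIU rotator is considerably more involved.

```python
class LazyRotator:
    """Recompute the rotated value only when somebody asks for it."""

    def __init__(self, width=64):
        self.width = width
        self.mask = (1 << width) - 1
        self.value = 0
        self.amount = 0
        self._cached = None       # None means "inputs changed since last read"
        self.evaluations = 0      # counts how often the real work was done

    def set_inputs(self, value, amount):
        self.value, self.amount = value & self.mask, amount % self.width
        self._cached = None       # cheap: just invalidate, do not rotate yet

    def output(self):
        if self._cached is None:
            self.evaluations += 1
            v, n = self.value, self.amount
            self._cached = ((v << n) | (v >> (self.width - n))) & self.mask
        return self._cached

rot = LazyRotator(width=8)
for amount in range(5):           # inputs change on every cycle ...
    rot.set_inputs(0b10000001, amount)
print(bin(rot.output()), rot.evaluations)   # ... but the work is done once
```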

2024-10-23 Everything is simpler in software

We have reached a bit of a milestone with the emulator:

When we started there were almost 800 components on 46 pages of MEM32 schematics, and that was even after we cheated and left out all but one set of the inverter drivers for the signals to the DRAM banks.

Now we are down to a single component, which does all the actual work plus the diagnostic processor which does nothing but exist.

To add insult to injury, the source-code for that single component is only 532 lines in our Python/C++ chimera language.

The TYP and VAL boards have had almost all functionality merged into a single component as well, 758 and 545 lines of code respectively.

IOC, FIU and SEQ are putting up more of a fight.

One of the runs in progress will probably come in around 36 hours to login.

2024-10-01 faster and faster

The changes we make in the emulator at present are subtractive, two component activations per micro cycle here, another one there and so on.

But because the total number of activations is now down to around 400 per micro cycle, each one saved matters more and more in terms of overall performance.

The state of play right now is:

act/µc	%	board
14.000	3.47	emu
34.944	8.67	typ
40.956	10.16	val
52.599	13.05	ioc
59.520	14.77	mem32
84.113	20.87	fiu
116.871	29.00	seq
403.000	100.00	Total

Still a way to go…

2024-08-31 below 72hours

The latest integration run booted in 71½ hours - less than three days and 33% faster than two months ago.

2024-08-11 Making progress

As can be seen from the table above, we're making some serious progress with the emulator performance now.

The current effort is ditching as much of the diagnostics as possible, retaining only what is necessary to run FRU_P2UCODE, FRU_P3UCODE, RUN_UDIAG and to boot the environment.

With diagnostics out of the way, a lot of "condensation" becomes possible.

We have condensed each board onto a single schematic sheet, starting around a square meter, but rapidly shrinking.

On the test-bed right now is a branch where the entire VAL datapath has been condensed into a single "chip".

As a result, the VAL board now uses 30% fewer SystemC activations than the TYP board.

The next step is to integrate the RF address generator, which currently calculates the addresses more often than necessary, and multiplexes the C address onto the A+B address busses during H2.

But first, we optimized how the C-address was calculated, and the resulting C++ code looks a lot like it implements the microcode encoding as described on the front page of the VAL schematic, but there are a couple of twists relating to the CSA (Control Stack Accelerator), which were not on the VAL schematics.

We also have similar improvements to MEM32 and SEQ in the pipeline, they will all hit the table above, after they complete their test-runs.

2024-07-13 The Scavenger

The Memory-Monitor, which lives on the FIU board, is the control-logic common to all the memory boards in the system.

A part of the Memory-Monitor is a circuit called "Scavenger" which is described in the 1982 document Functional Specification of the Memory Monitor on pdf page 24.

The 1984 document R1000 Hardware on pdf page 70 says:

The memory monitor also has a scavenger RAM (page 61). This was intended to provide a garbage collection scheme for the memory manager. However, that scheme was abandoned and the scavenger RAM is no longer used.

We have just experimentally found out, that "is no longer used" is not the same as "can be removed".

An emulator run without the scavenger runs fine until the Virtual Memory has been started, and then it keels over with:

  23:15:55 !!! EEDB Assert_Failure Subsystem_Map Inconsistency found in
               Reset_Kernel_Node
  23:15:55 *** EEDB.Init Format_Error Exception = <Exception: Unit = 3586580, Ord
               = 1>, from PC = #163C13, #3AF
           *** Calling task (16#C985404#) will be stopped in wait service
  23:15:55 !!! Internal_diagnostic (trace) ** Start of trace **
  23:15:55 !!! Internal_diagnostic (trace) Task: #C987804
  23:15:55 !!! Internal_diagnostic (trace)     Frame:  1,  Pc = #92013, #64D
  23:15:55 !!! Internal_diagnostic (trace)     in rendezvous with #C985404
  23:15:55 !!! Internal_diagnostic (trace) Task: #C985404
  23:15:55 !!! Internal_diagnostic (trace)     Frame:  1,  Pc = #163C13, #151B
  23:15:55 !!! Internal_diagnostic (trace)     Frame:  2,  Pc = #163C13, #1B52
  23:15:55 !!! Internal_diagnostic (trace)     Frame:  3,  Pc = #163C13, #19B1
  23:15:55 !!! Internal_diagnostic (trace)     Frame:  4,  Pc = #92813, #C8
  23:15:55 !!! Internal_diagnostic (trace)     Frame:  5,  Pc = #92813, #D6
  23:15:55 !!! Internal_diagnostic (trace)     Frame:  6,  Pc = #92413, #E4
  23:15:55 !!! Internal_diagnostic (trace)     Frame:  7,  Pc = #92013, #CC
  23:15:55 !!! Internal_diagnostic (trace) ** End of trace **

But other than that, as can be seen in the table above, the emulator is getting faster. We have pulled out all the parity error checks, but not the ECC checks, and we have started to eliminate the diagnostic circuitry, retaining only what is necessary to run FRU::P2UCODE, FRU::P3UCODE, RUN_UDIAG and the Environment.

2024-07-04 Improvement on second IOC board - back to state as of 2023-11-16

We found the problem that gave us a set-back on the second IOC board. While replacing one of the chips, the IO.DCLK line was damaged so it no longer provided a signal to 5 latches. We have soldered an extra wire on to fix this, and the board now boots like it did back in 2023-11-16.

So this IOC now passes self test, again, but fails to communicate properly with SCSI over the UniBus interface.

At least the set-back has been reset now...

2024-06-15 End of no-parity experiment

For now we have come to the end of the experiment where we got rid of all parity and ECC checking.

It speeds the emulator up by 20-ish percent, which is nothing to sneeze at, but we have concluded that we are not yet ready to continue without the diagnostic subsystem.

2024-06-13 n_running_r1000++, and perhaps a lead on IOC2 trouble

Finally got time to take a look at the R1000. With one of our previously hacked BIOS the R1000 complained about "G6".

So after a little work with a side-cutter and a soldering iron, a new SRAM-chip on G6 was in place and the IOC was yet again happy about Life, the Universe and Everything :-)

Another try to identify the elusive problem on IOC board 2 might have given something:

The trigger is when BYTE0 is asserted causing a BERR and in turn the M68K to trap. The trace hints that SRAM G47 (DQ1 in bank 0) has become slow. DQ1 goes high just after CLK.2X clocks and that causes the parity on BYTE0 to fail. Several other traces show the same pattern: DQ1 is much slower than the other lines. Next time we will replace G47, fingers crossed!

2024-06-09 Parity elimination

The R1000 computer conservatively stores a parity-bit with any byte stored in a RAM chip or passed over the backplane or foreplane busses, and as a general rule the parity bits are passed through rather than recalculated. The DRAM storage on the MEM32 boards uses a 9-bit ECC code instead.

All in all, that takes a fair bit of logic in the hardware and in software it gets even slightly worse because calculating the parity is a slightly expensive operation.
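
To show why parity is not free in software, here is one common way to compute a byte's parity bit, as a Python sketch. The function name `parity_bit` is made up for this example, and whether the R1000 uses odd or even parity on a given bus is not stated here, so the polarity below is an assumption.

```python
def parity_bit(byte: int) -> int:
    """Return 1 if the byte (0..255) has an odd number of one-bits, else 0."""
    byte ^= byte >> 4     # fold the high nibble onto the low nibble
    byte ^= byte >> 2     # fold again: two bits left
    byte ^= byte >> 1     # and again: the answer is now in bit 0
    return byte & 1

# A bus that is 8 bytes wide needs this done for every byte, every transfer.
word = bytes([0x00, 0x01, 0x03, 0x07, 0xFF, 0x80, 0x55, 0xAA])
print([parity_bit(b) for b in word])
```

Hardware gets this from an XOR tree in a few nanoseconds, in parallel with everything else; the emulator pays for it in instructions on every simulated transfer.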

We do not expect the simulated RAM chips or busses to have bit-errors, so all this is surplus to requirements for the emulator, but removing it to speed things up comes at the cost of many tests failing, because they validate the parity circuits.

For the last couple of weeks we have worked to eliminate all the parity-checking in a separate branch, with the goal that run_udiag and booting the environment are the only two things that must work.

We had to modify that goal slightly, because run_udiag actually tests the ECC circuitry near the end, so instead of ending in success, it now ends with code 29F2 - ECC_EVENT_NOT_TAKEN.

With all that said, the run which just ended now took only 106 hours, 315 times slower than hardware.

The run we started instead seems to be a percent faster or so.

No news on the hardware front, everybody is busy.

2024-05-07 Gentlemen: Start your FPGAs

We have branched Release2 of the github project to celebrate the fact, that we have simulated the HW-true schematics all the way to login.

That took 59 days and 19 hours on the MacBook M2.

The point of this branch is to provide a well documented launch-pad, should anybody want to create a FPGA based Rational R1000.

2024-04-26 n_running_r1000--

When we tried to boot the working R1000 yesterday, we got IOC RAM errors :-(

2024-04-15 Spring has sprung

And that means a boatload of distractions, so activity is a bit low right now.

On the MacBook M2 the hardware schematic simulation is chugging along, 4000 times slower than hardware (1250 Hz instead of 5 MHz), but it has gotten to CMVC.11.8.2D and in another three weeks it should be at the login prompt.

We are a bit stuck on the IOC#2 board. The best current hypothesis is that the different SRAM speeds cause the byte parity lines to flutter at an inconvenient time, but we do not have enough 25ns chips to verify that theory, and DIP versions of that SRAM are hard to come by.

Given the prospect that more SRAM chips are guaranteed to die on us, and the fact that surface mount versions of CY7C187 are still obtainable, making a small adapter board, for instance with three chips per board, looks more and more attractive.

2024-03-27 Integrated simulated MEM32

We have started to implement the MEM32 board's core functionality as a single integrated SystemC/C++ class.

The first stage is to implement the tag-RAM logic so that its contents are kept identical to the working "discrete" simulation of the MEM32 board.

So far we are at around 350 lines of code, with a lot of duplication in order to have the same precise memory layout as the discrete model, so we can compare them with memcmp(3).

The Functional Specification of the Memory Monitor has been of good help in this, and it has mostly been a matter of waiting for the test-runs to crash, and then look at the end of the 100+GB trace-file to find out why.

The second stage will be doing the same for the DRAM arrays. We do not anticipate any challenges here; once the tag-logic is ironed out, this should follow trivially.

The third stage will be to migrate the MEM32's output signals from the discrete model to the integrated model.

We do not have any documentation which explains the meaning and timing of the control signals, so some Prototyping™ (The methodology formerly known as "Trial&Error™") will be involved.

The fourth and final stage will be to eliminate the discrete simulation to reap the "integration-dividend".

This will mostly be about simulating enough of the Diagnostic Archipelago to keep the programs on the IOP happy.

2024-03-19 and more debugging

The following measurements have been done with the termination resistor pin of A22 cut (no line terminator).

The last session lacked measurements of some of the SRAM pins. VCC between H2 and H19 is 4.91-4.95V with ripple averaging 140mV, but increasing to as much as 450mV every 12.8us (78.1kHz) for about 100ns.

Other observations:

Pin 10 /WE switches at 10MHz continuously (/CE not asserted) - The superimposed signal on A22 is at the same frequency, and it is not unlikely that the lines of A22 and /WE run close by each other.

Q-output of H40 (DQ8) has a slightly different shape than on any other of the 72 SRAMs. Interestingly enough, the noise sticks with H40 even when exchanging the SRAM with one from another position.

(Scope images: DQ8 and DQ!8)

SRAM output while not driven is at 1.8-1.9V.

2024-03-14 and debugging

Assuming the IOC2 problems are rooted in SRAMs causing parity errors at times when they are not meant to be tested, we set out to systematically measure the IOC2 SRAMs to find out if they are working within specs, starting with the address lines. Address-wise the SRAMs are organized in two banks of 256KB, a low and a high bank, 512KB in total. Physically the SRAMs are grouped in 4 blocks. Each block holds the low and high bank of the same byte: 16+2 SRAMs in each block sharing the same address lines (separate chip selects):

*Table 1*
Physical SRAM layout (Bank / Byte)
row/column	H	G
41-48	H/1	L/0
33-40	L/1	H/0
31-32	Parity 1	Parity 0
18-19	Parity 3	Parity 2
10-17	H/3	L/2
2-9	L/3	H/2
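
The chip count and capacities follow from the layout; here is a quick Python check of the arithmetic. It assumes every position holds a 64Kx1 SRAM (the chip type mentioned in other entries) and one parity chip per byte per bank.

```python
CHIP_BITS = 64 * 1024      # one 64Kx1 SRAM holds 65536 bits
BYTE_LANES = 4             # BYTE0..BYTE3, one physical block each
BANKS = 2                  # low and high

data_chips_per_block = 8 * BANKS       # 8 data bits for each bank
parity_chips_per_block = 1 * BANKS     # 1 parity bit for each bank
chips = BYTE_LANES * (data_chips_per_block + parity_chips_per_block)

bank_bytes = CHIP_BITS * 8 * BYTE_LANES // 8   # data bits only, in bytes
total_bytes = bank_bytes * BANKS

print(data_chips_per_block, parity_chips_per_block, chips)
print(bank_bytes // 1024, "KB per bank,", total_bytes // 1024, "KB in total")
```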

First set of measurements were done on address lines on the physical ends of each block.

*Table 2*
Pin	SRAM address line noise, Pk-Pk (mV)								Line
Pin	H2	H19	G2	G19	H31	H48	G31	G48	Line
1	170	220	180	170	240	310	185	250	A14
2	140	210	130	210	175	135	155	205	A15
3	165	210	85	135	140	130	170	200	A16
4	200	200	170	180	145	150	175	115	A17
5	140	165	125	170	175	180	210	130	A18
6	175	145	110	200	175	150	165	140	A19
7	105	210	145	200	150	115	205	130	A20
8	190	135	200	140	165	115	210	190	A21
14	775	365	570	275	180	445	195	275	A22
15	230	340	130	95	160	185	205	190	A23
16	125	150	130	150	170	120	175	165	A24
17	150	150	115	135	240	232	230	150	A25
18	130	200	90	160	150	145	200	170	A26
19	195	180	160	155	165	170	195	205	A27
20	135	195	115	135	140	130	245	185	A28
21	130	190	110	150	135	130	175	195	A29
	Bank 0		Bank 1		Bank 2		Bank 3

(The measurements should probably have included all pins and not just the address lines...)

In an attempt to locate the source of the noise on A22 (pin 14) of BYTE3, a second set of measurements was done. The values below are accurate to about 10mV, based on a "manual" average of what was measured on the scope:

*Table 3*
Noise on pin 14
Chip	Pk-Pk (mV)	Is new
H2	770	*
H3	760
H4	730
H5	720	*
H6	700	*
H7	670
H8	660	*
H9	630
H10	610	*
H11	580	*
H12	540
H13	520	*
H14	480
H15	450
H16	420	*
H17	400	*
H18	370	*
H19	350	*

Clearly, least noise at the drivers at the center and most noise toward the termination resistors at the edge of the board. Same pattern on the other banks, although not as pronounced (Table 2).
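
The gradient in Table 3 can be quantified with a straight-line fit. The following Python sketch uses the table's values as they stand; treating chip position as a linear distance along the line, and comparing new against old chips via their distance from the fitted line, are assumptions made for illustration.

```python
# Table 3: chip position in column H -> (Pk-Pk noise in mV, chip is new)
TABLE3 = {
    2: (770, True), 3: (760, False), 4: (730, False), 5: (720, True),
    6: (700, True), 7: (670, False), 8: (660, True), 9: (630, False),
    10: (610, True), 11: (580, True), 12: (540, False), 13: (520, True),
    14: (480, False), 15: (450, False), 16: (420, True), 17: (400, True),
    18: (370, True), 19: (350, True),
}

xs = list(TABLE3)
ys = [TABLE3[x][0] for x in xs]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Ordinary least-squares line through noise versus chip position.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def mean_residual(is_new):
    """Average distance from the fitted line for new (or old) chips."""
    res = [TABLE3[x][0] - (intercept + slope * x)
           for x in xs if TABLE3[x][1] == is_new]
    return sum(res) / len(res)

print(f"slope: {slope:.1f} mV per chip position")
print(f"mean residual, new chips: {mean_residual(True):+.1f} mV")
print(f"mean residual, old chips: {mean_residual(False):+.1f} mV")
```

The end-to-end drop from H2 to H19 is 420mV over 17 positions, or roughly 25mV per chip, so the noise falls off fairly evenly along the line.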

The termination resistors are DIL16 chips R220/330 serving multiple lines. A line next to A22 is a CLK.2X line working at 10MHz. Zooming in on the A22 noise, it turns out that the noise is actually at CLK.2X frequency.

This led to the last test done: cutting the termination resistor pin of A22 to see if the resistor is somehow responsible for the noise - It is not; the chip-side becomes nice and steady, while the board side (A22) noise remains, although at a lower mean level. The line is actively driven low by the driver.

Earlier, while exchanging one of the SRAMs our de-solder machine spewed out some solder, which we hoped to have cleaned up, so a close-in re-inspection was done, without finding any remains.

We tried to remove all the new SRAMs on BYTE3, but the noise remains.

Solder residue still on the board after the spillage? - Table 2 does not seem to agree (identical patterns on other banks).
Solder between A22 (pin 14) and a neighbor pin on one of the exchanged SRAMs? - Not likely, pin 13 is data in which is stable and pin 15 is A23 which is much less noisy than A22.
Defect termination-resistor chip? - The pin-cut-test does not support this.
Defect driver? - Wouldn't the noise amplitude be higher closer toward the driver?
New SRAMs radiates the noise? - No, we tried to remove these, same noise.
Old SRAMs radiate the noise? - Table 3 does not support this.

But then what???

2024-03-09 Still debugging

We're still debugging IOC2, which has done us the favour of failing much more predictably.

Current hypothesis is that when the M68K CPU writes to registers in UNIBUS space, the IOC is designed such that the parity information put on the UNIBUS comes from the RAM on the IOC board, even though that RAM takes no part in the UNIBUS transaction.

During the EEPROM based selftest, this parity information is read back and causes a bus-error signal to get raised, which the EEPROM has not yet prepared for.

One possible explanation for why we see this is that we have replaced the defective 64Kx1 SRAM chips with "NOS" chips which are slightly slower (35ns), so the parity-error signal may take longer to stabilize.

Work continues on the Emulation, where we are both speeding up the "Optimized" schematics and working to get the HW-true schematics to pass all tests and boot again.

2024-02-19 T-12y and counting

If we can speed up the emulator by one percent every week, it will run as fast as the actual hardware in a little over 12 years.

And since 1%/week does seem to be our rate of improvement right now, it is probably time for a new strategy.
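The arithmetic behind that estimate can be sketched as compound growth. The slowdown factor on this date is not stated in this entry, so the 420 used below is borrowed from the 2024-01-26 entry and is an assumption, as is the reading that each week's 1% compounds on the previous week's speed. `weeks_to_parity` is a hypothetical helper name.

```python
import math


def weeks_to_parity(slowdown, weekly_gain=0.01):
    """Weeks of compounding `weekly_gain` speedups needed to cancel `slowdown`."""
    # Solve (1 + weekly_gain) ** weeks >= slowdown for weeks.
    return math.ceil(math.log(slowdown) / math.log(1.0 + weekly_gain))


weeks = weeks_to_parity(420)  # 420x slower: figure from the 2024-01-26 entry
print(weeks, "weeks, or about", round(weeks / 52, 1), "years")
```

Because the result depends only on the logarithm of the slowdown, any factor between 400 and 600 lands between roughly 11.5 and 12.5 years, so the estimate is not very sensitive to the exact starting point.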

Until now the optimizations have been conditioned on keeping everything working, including all the diagnostic facilities, but this is increasingly becoming a headache for us, because the diagnostic facilities, sensibly, test the implementing circuits rather than the desired architecture, and they often test the implementing circuits in ways the desired architecture does not use them.

To give a concrete example:

The architectural view of a micro instruction on the TYP board may be: "Take whatever is on the TYP bus, and store it at location X in the register file"

The implementing circuitry needs to put the TYP bus on the A or B bus to the ALU, tell the ALU to pass that bus through to its output where it is put on the ALU bus, from there to the C bus, and then set up the right address for the register file and strobe the Write Enable signal to the actual register file RAM devices.

But in the implementing circuitry, the C bus does not go to the register file RAM devices, only the A and B busses do, so there is an extra step of gating the C bus to both the A and B busses, because both sides of the register file need to be written.

Because the diagnostic tests test the implementing circuitry, they test that the TYP bus can be placed on the A and B busses, that the C bus can be placed on the A and B busses, and that the ALU can pass a value through unharmed.

If we try to optimize things by using true dual-port register file RAM devices, which is what the architecture calls for, and get rid of the extra bus-switch-yard, which the reduction of the architecture to implementing circuitry brought into existence, the diagnostics will fail.

So it may be time for us to give up on the low-level diagnostics, and rely only on the micro-code based diagnostics going forward.

We will try out this idea on the MEM32 board, because it is, all things considered, very simple and it has no microcode.

2024-02-04 Lookin' good

The museum was open to the public today and we used the real Facit terminal to log into the emulator:

Seems to work, as expected, but it is slooooow…

2024-01-27 And we have liftoff!

Ladies & Gentlemen!

The R1000/s400 software emulator works:

  09:19:31 +++ Operator Enable_Terminal  244
  09:19:31 +++ Operator Enable_Terminal  245
  09:19:31 +++ Operator Enable_Terminal  246
  09:19:31 +++ Operator Enable_Terminal  247
  09:19:31 +++ Operator Enable_Terminal  248
  09:19:32 +++ Operator Enable_Terminal  249
  09:19:32 !!! Machine_Initialization_Start Exception_While_Starting
               TMS_Elaborate in context !Machine.Initialization.Rational
  
  ====>> Elaborator Database <<====
  EEDB:
           
  ====>> Console Command Interpreter (System Job 223) <<====
  username: pam
  password: 
  session: s_1
  99/01/22 09:20:04 --- pam.s_1 logging in.
  
  ====>> Ci.Interpret (PAM.S_1 Job 221) <<====
  command: what.users
  
  ====>> What.Users (PAM.S_1 Job 219) <<====
  User Status on January 22, 1999 at 9:20:13 AM
  
       User       Line  Job   S      Time     Job Name                           &
  
  ==============  ====  ===  ====  =========  ===================================&
  ======
                                              
  *SYSTEM            -    4  RUN   13:54.446  System
                          5  RUN   13:54.447  Daemons
                        223  IDLE     29.439  Console Command Interpreter
                        246  IDLE     53.480  Rational_Access Commands Rev 1_0_2
                        248  IDLE     57.840  Print Spooler
                        250  IDLE   1:03.681  Smooth Snapshots
                        253  IDLE     58.227  Ftp Server
                                              
  PAM                -  219  RUN       1.377  WHAT.USERS
                        221  IDLE      4.966  CI.INTERPRET("!machine.device ..., &
  ...)
                                              
  NETWORK_PUBLIC     -  225  IDLE     37.757  Archive Server
  
  ====>> Ci.Interpret (PAM.S_1 Job 221) <<====

The bad news is that it takes an hour just to login…

2024-01-27 … and warmer

  NATIVE_DEBUGGER.11.2.0D
  CROSS_DEVELOPMENT.DELTA
  INITIALIZE.11.2.4D         
          
  ====>> Environment Log <<====
  09:17:25 +++ Operator Enable_Terminal  16
  09:18:55 !!! !Machine.Initialization.Rational.Dtia
               Unable_To_Start_Dtia_Rpc_Server ERROR  Activity does not define a
               valid load view for the subsystem of !TOOLS.DTIA_RPC_MECHANISMS.
               REV11_4_SPEC.UNITS.TARGET_INTERFACE.ELABORATION'SPEC

2024-01-27 Getting warmer

  TOOLS_INTEGRATION.DELTA    
  CMVC.11.8.2D               
  DESIGN_FACILITY.DELTA
  ARCHIVE.11.4.0D            
  NATIVE_DEBUGGER.11.2.0D
  CROSS_DEVELOPMENT.DELTA
  INITIALIZE.11.2.4D         
          
  ====>> Environment Log <<====
  09:17:25 +++ Operator Enable_Terminal  16
  
  ====>> Elaborator Database <<====
  EEDB:

2024-01-26 Thinking ahead

While we wait to see where the emulation croaks next time, it is a good time to take a step back and think ahead.

The primary goal of this entire project was to produce a software emulation of the R1000/s400 computer, so that the uniqueness of the computer, and of the Rational Environment will not be lost to humanity, when the last hardware finally releases the magic smoke.

The question I want to discuss here is a bit like »Let's go see Rome in our vacation.«

There may be only one set of longitude and latitude, but there are dozens of Romes at that location: The Pope's Rome, The Cæsar's Rome, The Fascists' Rome, The Independent Rome, The culinary Rome, The Opera Rome and so on. Which one, or which ones, of those Romes do you want to experience, and how long is your vacation ?

Likewise there are many R1000's to emulate: There is the R1000 running the Rational Environment, there is the R1000 implementing the Rational Architecture, there is the R1000 son-of-son of the original four-CPU monster s100, there is the R1000 with the amazing diagnostic abilities, there is the R1000 with type-checking in hardware, there is the R1000 with distributed microcode, there is the R1000 which is bit-oriented, there is the R1000 without linear address-space and so on. How much does the R1000 interest you ?

Because we were forced to implement the R1000 emulator from the hardware schematics, we have ended up with a software emulation which allows one to study all these aspects of the R1000, but at glacial speed: I have not tried running the "Hw" branch schematics for a long time, but last I tried they were several thousand times slower than the hardware.

If you want to understand how the ECC circuitry and hardware works you can probably live with that, if you want to demo the Rational Environment's semantic view of large Ada Projects … not so much.

What we have right now is the "Hw" branch, which is supposed to be chip for chip, wire for wire, identical to the two R1000/s400 computers we have in Datamuseum.dk, except for the RESHA board. If you want to study ECC, this is the one you want.

The other branch is the "Optimized" branch, which sports weird and complex chips, some of which contain 1/3 of the FIU board and others which have weird back-door connections to the emulator to load the microcode faster. This branch currently only runs 420 times slower than the hardware, and that is neither fish nor fowl, but it may transpire that it actually works.

If so, it will be perfectly defensible to stop here, we have reached the goal we set ourselves and we can leave the performance issue as an exercise for future generations.

But that would be neither fun nor satisfying, so what do we do instead ?

One possibility would be to validate that the Hw branch also works, and then convert that into VHDL or Verilog, synthesize it to an FPGA and create a 1:10 scale-model of the R1000/s400.

For many reasons, mostly lack of skill, I cannot do that. If you want to, I'll do everything I can do to help you, because I'd love to have one on my desk, but I cannot do it.

One possibility is to continue to file away on the "Optimized" branch, shaving a percent performance here, and a percent performance there. I can do that, and if I can shave 1% every week, I will reach speed parity with the hardware … in 12 years. There is a good chance that at some point I will understand the hardware well enough to get down to six components: IOC, FIU, SEQ, TYP, VAL & MEM32, but they will no longer teach you anything about the actual hardware of the R1000, in fact the "Optimized" branch already does not do that.

Another option is to use the observability provided by the "Optimized" branch to start implementing a "traditional" software emulation at the macro-instruction level. That will undoubtedly give us the fastest running emulator, but it will be tough going, because we need to understand both the hardware and the microcode to find out what the instructions actually do.

Going down a layer, we can implement a traditional software emulation at the micro-code level. This has the benefit that we do not need to understand the microcode instructions in relation to each other, or indeed anything about the macro instructions or Ada, we just need to execute all the actually used micro instruction words correctly. I expect such a simulator to run at least several times faster than the hardware, even on a tiny computer like a Raspberry Pi, and to ensure the correctness of that emulation, it can be compared directly bit for bit with the Optimized branch or even the Hw branch.

In terms of preservation, in the museum sense of the word, there is no doubt that the Hw branch is the real deal.

In terms of presentation of the Rational Environment, the Hw branch is useless, and an FPGA solution will have to be "refreshed" every couple of decades, because sooner or later the magic smoke escapes.

A microcode emulation would allow presentation of the Rational Environment with high fidelity, and if written in a stable programming language, it can last forever.

So I should probably start to think about that…

/phk

2024-01-25 Finally getting somewhere

2024-01-23 Two down, two to go

We are running two test-runs in parallel, and they both passed PRETTY_PRINTER just now:

We have seen that once before, but this time we think we know why they made it.

The real test is in 24 hours, when they reach OS_COMMANDS, which we have never managed to get past.

2024-01-22 Well, that would do it...

The python script delivered its verdict overnight, and … ehh … duh ?

  scsi_d 1 WRITE_6 0a000ec003000000
  scsi_d 1 READ_6  08000ec001000000
  scsi_d 1 READ_6  08000ec101000000

The hex-strings are SCSI CDBs and the '03' in the WRITE_6 command means that three sectors are transferred.
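For readers who want to pick such a hex-string apart, here is a minimal sketch of a 6-byte CDB decoder. The field layout (opcode, 21-bit block address, one-byte transfer length where zero means 256) is the standard SCSI 6-byte CDB format; treating the two trailing bytes of the trace's 8-byte strings as padding is an assumption, and `decode_cdb6` is a hypothetical helper, not a function from the emulator.

```python
OPCODES = {0x08: "READ_6", 0x0A: "WRITE_6"}


def decode_cdb6(hexstr):
    """Return (name, lba, sector_count) for a 6-byte SCSI CDB given as hex."""
    raw = bytes.fromhex(hexstr)
    opcode = raw[0]
    # The low 5 bits of byte 1 plus bytes 2 and 3 form the 21-bit block address.
    lba = ((raw[1] & 0x1F) << 16) | (raw[2] << 8) | raw[3]
    # Byte 4 is the transfer length; zero means 256 blocks.
    count = raw[4] or 256
    # Byte 5 is the control byte; anything after it is ignored here
    # (assumed to be padding in the trace output).
    return OPCODES.get(opcode, hex(opcode)), lba, count


for cdb in ("0a000ec003000000", "08000ec001000000", "08000ec101000000"):
    name, lba, count = decode_cdb6(cdb)
    print("%-8s lba=0x%x count=%d" % (name, lba, count))
```

Decoded this way, the WRITE_6 covers sectors 0xec0 through 0xec2, and the two READ_6 commands fetch 0xec0 and 0xec1 one sector at a time.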

That was news to us: until now we had not seen, or at least not noticed, the R1000 CPU using anything but single sector transfers.

But we do support multi-sector SCSI transfers, DFS uses them when it push/pull's programs for instance, so that in itself should not be a problem.

However the trace reveals something we have not seen before:

  mailbox f 4080000f 81020033 0f000103 00050852
  mailbox d 4080000d 81020134 0f000103 00050852
  mailbox c 4080000c 81020235 0f800103 00050852
  scsi_d 1 WRITE_6 0a000ec003000000

The three mailboxes are out of order and non-contiguous: f-d-c.

Each mailbox has an attached 1Kbyte buffer, so if the mailboxes had been in c-d-e order, it would have worked fine.

In other words: Scatter/gather disk-I/O.

That, finally, explains what the "IO BUS MAP" on page 23 and 24 of the IOC schematic is good for.

Fortunately the fix looks trivial:

   -       dma_read(sd->ctl->dma_seg, sd->ctl->dma_adr, ptr, len);
   +       for (xlen = 0; xlen < len; xlen += (1<<10)) {
   +               dma_read(sd->ctl->dma_seg, sd->ctl->dma_adr + xlen, pp + xlen, 1<<10);
   +       }

and similar for write.
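Why splitting the transfer into 1 Kbyte pieces matters can be shown with a toy model. Everything here is hypothetical: the log does not say how the IO BUS MAP is indexed, so the `io_map` dictionary, the page-aligned start address and the buffer numbering are illustrative assumptions, chosen only to mirror the f-d-c mailbox order seen in the trace.

```python
PAGE = 1 << 10  # 1 Kbyte, the size of the buffer attached to each mailbox


def dma_read(memory, io_map, bus_addr, length):
    """Read `length` bytes; the map is consulted once, for the start address."""
    page, offset = divmod(bus_addr, PAGE)
    assert offset + length <= PAGE, "transfer crosses a map page"
    base = io_map[page]
    return bytes(memory[base + offset:base + offset + length])


def dma_read_chunked(memory, io_map, bus_addr, length):
    """Gather a long transfer one page at a time (bus_addr assumed page-aligned)."""
    out = bytearray()
    for xlen in range(0, length, PAGE):
        out += dma_read(memory, io_map, bus_addr + xlen, min(PAGE, length - xlen))
    return bytes(out)


# Toy memory: sixteen 1K buffers, buffer n filled with the byte value n.
memory = bytearray()
for n in range(16):
    memory += bytes([n]) * PAGE

# Three consecutive bus pages mapped to buffers f, d and c, in that order.
io_map = {0: 0xF * PAGE, 1: 0xD * PAGE, 2: 0xC * PAGE}

data = dma_read_chunked(memory, io_map, 0, 3 * PAGE)
print([data[i * PAGE] for i in range(3)])
```

In this model a single 3 Kbyte read starting at buffer f would run off into whatever follows it in memory, while the chunked read picks up the three buffers in mailbox order.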

This is going to be an interesting and slightly tense week…

2024-01-21 Aha!

Armed with two binary traces, one from a run which stopped at PRETTY_PRINTER (6GB) and one which managed to pass that (11GB), we have now found out that the failing run reads sector 0xec1 from drive 1 and gets all zeros back, whereas the good run reads the same sector and gets non-zero data back.

This would not be inconsistent with Exception 16#20 being "Numeric Error (zero divide)" according to pdf page 69 in Guru Course day 1.

The non-zero data in the good case is not identical to the content of the disk-image when we start the emulator, so it must be data written during the run.

A python script is now trudging through the smaller trace to find that disk write operation.
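The search itself is simple once the trace has been decoded. The following is a sketch of the idea only, not the script we ran: it assumes the binary trace has already been reduced to `(drive, op, lba, count)` tuples in trace order, which is a made-up intermediate format, and `last_write_before_read` is a hypothetical name.

```python
def last_write_before_read(events, drive, sector):
    """Index of the last write covering `sector` before its first read.

    Returns None if the sector is read before any write touches it,
    or if it is never read at all.
    """
    last = None
    for index, (drv, op, lba, count) in enumerate(events):
        if drv != drive or not (lba <= sector < lba + count):
            continue  # other drive, or the transfer does not cover the sector
        if op == "WRITE_6":
            last = index
        elif op == "READ_6":
            return last
    return None


# The three commands quoted in the 2024-01-22 entry, already decoded.
events = [
    (1, "WRITE_6", 0xEC0, 3),
    (1, "READ_6", 0xEC0, 1),
    (1, "READ_6", 0xEC1, 1),
]
print(last_write_before_read(events, drive=1, sector=0xEC1))
```

The important detail is the range test: a write has to be matched on the sectors it covers, not just on its starting sector, otherwise a multi-sector write is missed.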

2024-01-11 IOC2 goes bad(er)

Yesterday we went through our TODO list and were thrown off at several points along the way.

For instance, it transpired that the disk images we preserved from the disks in PAM's machine did not include the three "diagnostic cylinders" at the end, they had only 1655 of the disk's 1658 cylinders.

Until we ran the DISKX.M200 program this did not matter, and we never noticed, and because the emulator correctly emulated the disks as having 1658 cylinders, it had no problem with DISKX.

But we configured the SCSI2SD disk emulator from the size of the images, so it presented only 1655 cylinders in the "mode pages" and when we tried to run DISKX on the hardware, the kernel panic'ed when it saw I/O requests to cylinders past the end of the disk, as it should have.

The visual inspection found no relevant differences between the two boards, and the handwritten correction to IOFF0A on the schematic must have predated the s400, because it was incorporated in the PCB layout.

Next we soldered connections to some of the interesting chips on the solder-side of the IOC, removed all the other boards from the machine, to make space to get to the connections, and tried to make sense of things.

But by now the board did not even complete the IOC EEPROM self-test, it failed with the RED led on the front-panel indicating that the M68K had halted itself. Eventually we could show that two of the byte parity signals from the IOC RAM were noisy, even in this state where nothing was accessing them, and we decided to call it a night. If we can reproduce that noise with the board on the workbench, it should be quite easy to track down where the noise comes from. It could be the 74F280 parity generators, but it is far more likely to be another couple of 64Kx1 SRAM chips heading for the eWaste bin.

The emulator now reaches PRETTY_PRINTER in 38h43m.

2024-01-09 IOC2 in the memory wringer

Happy New Year!

This last Sunday the museum was open, and we used the chance to give the SRAM on IOC2 a really good workout with the M68K CPU.

Using the excellent "vasm" by Volker Barthelmann et al, we have created an ad hoc tool-chain which allows us to write test-code in M68K assembler and get commands out, ready to be pasted into the IOC's builtin low-level debugger.

The low-level debugger's commands are described on the first page of The IOC Schematics and not too bad to use manually, once you memorize them, and the addresses you are working with, but for entering even trivial short programs, cut and paste directly from the tool-chain is much more convenient.

But after beating the RAM up for several hours in various creative ways, we found no reason to think the M68K is able to trigger the parity-related or -adjacent situation which causes huge disk-writes to stall.

We also tried another interesting experiment: we overwrote PROGRAM_2.M200 with DISKX.M200. This is pretty trivial to do, since the DFS filesystem lays files out in contiguous sectors, and the RPi we use as console server has a "backdoor" USB connection to the SCSI2SD disk emulator board. With the sector number where the file starts, obtained from the emulator's "dfs" CLI command, good old dd(1) will do the job.
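The same overwrite can be sketched in Python against a disk image file instead of the SCSI2SD device. The 1024-byte sector size is an assumption inferred from the 1 Kbyte mailbox buffers in the 2024-01-22 entry, the size check against the old file is an added safety measure rather than something we did with dd(1), and `overwrite_in_place` is a hypothetical helper.

```python
SECTOR = 1024  # assumed DFS sector size, see the lead-in


def overwrite_in_place(image_path, start_sector, old_sectors, payload):
    """Write `payload` over a contiguous file starting at `start_sector`.

    `old_sectors` is the number of sectors the existing file occupies;
    the payload must fit, or the next file on the disk would be clobbered.
    Returns the number of sectors the payload touches.
    """
    needed = -(-len(payload) // SECTOR)  # ceiling division
    if needed > old_sectors:
        raise ValueError("payload needs %d sectors, only %d available"
                         % (needed, old_sectors))
    with open(image_path, "r+b") as img:  # update in place, never truncate
        img.seek(start_sector * SECTOR)
        img.write(payload)
    return needed
```

Opening the image in "r+b" mode plays the role of dd's conv=notrunc: the rest of the image is left exactly as it was. The directory entry is not touched, so the file keeps its old name and recorded length.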

DISKX.M200 is a "DFS based disk exerciser", described in not too much detail on page 27 of the Command Interfaces manual.

By putting DISKX in PROGRAM_2's place, we can boot the system and via the EEPROM prompts load DISKX without ever doing a DFS "PUSH" and thus without doing any of the large disk writes which hang.

This overwrite trick is very handy, and we checked that it also works for CLI.M200, and for that matter, we can patch KERNEL.M200, FS.M200 or even write our own programs to run.

But back to DISKX: When run this way in the emulator, it works fine, run on IOC2 it fails hard and fast:

  Initializing M400S I/O Processor Kernel 4_2_18
  Disk  0 is ONLINE and WRITE ENABLED
  Disk  1 is ONLINE and WRITE ENABLED
  IOP Kernel is initialized
  Exercize unit 0 [Y] ?
  Exercize unit 1 [Y] ?
  Disk unit => 0, using cylinders [1648..1654]
  Disk unit => 1, using cylinders [1648..1654]
  DFS based
  I/O Processor Kernel Crash: error 066D (hex) at PC=00004B82
  Trapped into debugger.
  RD0 00000676  RD1 00000000  RD2 00000001  RD3 00000001
  RD4 00000444  RD5 00000000  RD6 80001900  RD7 00000001
  RA0 0000E820  RA1 0003FF80  RA2 0000E83C  RA3 00020578
  RA4 00026350  RA5 0003FF4E  RA6 0003FF8A  ISP 0000FAB0
   PC 0000A158  USP 0003FF4E  ISP 0000FAB0  MSP FDFF742D  SR 2704
  VBR 00000000 ICCR 00000001 ICAR DB359DB2 XSFC 7 XDFC 1

Crash error 066D is listed on pdf page 4 of IOC Schematics as "unimplemented" but from our disassembly of KERNEL_0.M200, it looks like a catch-all "Something's horribly wrong with disk-I/O".

Unfortunately JMP instructions are used to get to the error-emitting code, so no trace is left on the stack of which of the dozen or so different conditions caused it.

Patching those JMP's to JSR in KERNEL_0.M200 and trying again is now on our TODO list.
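On the 68000 family the two instructions differ in a single bit of the opcode word: JMP is 0x4EC0 plus the six effective-address bits, JSR is 0x4E80 plus the same bits. A patch can therefore be sketched as below. The helper name and the example address are hypothetical, and whether the error-emitting code tolerates the extra return address on the stack is something the sketch does not check.

```python
JMP_BASE = 0x4EC0  # 0100 1110 11 mmm rrr
JSR_BASE = 0x4E80  # 0100 1110 10 mmm rrr
EA_MASK = 0x003F   # effective-address mode and register bits


def jmp_to_jsr(code, offset):
    """Return a copy of `code` with the JMP at `offset` turned into a JSR."""
    word = int.from_bytes(code[offset:offset + 2], "big")
    if (word & ~EA_MASK & 0xFFFF) != JMP_BASE:
        raise ValueError("no JMP opcode at offset %d: %04X" % (offset, word))
    patched = bytearray(code)
    patched[offset:offset + 2] = (JSR_BASE | (word & EA_MASK)).to_bytes(2, "big")
    return bytes(patched)


# JMP to a made-up absolute long address, 4EF9 followed by the address.
before = bytes.fromhex("4ef900001234")
after = jmp_to_jsr(before, 0)
print(before.hex(), "->", after.hex())
```

Since the operand bytes are untouched, the patch does not change the length of the code, and the return address pushed by the JSR identifies which check fired.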

The fact that DISKX.M200 fails almost instantly, despite doing only single-sector transfers, hints that the problem, whatever it is, has to do with the intensity of DMA transfers more than the actual length of the individual transfers, which might indicate some kind of thermal issue.

Based on nothing but Mike Druke remembering that Signetics 74F373 caused some grief back in the days, we have previously replaced one such which is very prominent in the UNIBUS::PB path (IOREG12 @ K20), but that did not magically fix the problem.

The latch signal for that 74F373 is generated in the top right corner of IOC schematic page 25 (= pdf page 35), by IOFF0A, a 74F74, and the schematics have a handwritten correction which inverts that signal.

There is quite a gap in ECO levels between the working and non-working IOC boards, so we are also going to do a very detailed visual audit, to see if we can spot any differences, and in particular look for this one, since it goes to the heart of the trouble we see.

2023…2012

2023 Logbook entries - Working to get the second R1000 running and still working on emulation
2022 Logbook entries - We are getting somewhere with the emulation
2021 Logbook entries - We discover there is no instruction set, and start simulating hardware
2020 Logbook entries - Covid-19 happens, and we start to create a software emulation
2019 Logbook entries - We fix the IOC RAM error, and Pierre-Alain Muller drops by for startup

2014-2018 - The project got stuck for lack of a sufficiently beefy 5V power-supply, and then phk disappeared while he built a new house.

2013 Logbook entries - Documentation preservation and we run into the IOC RAM error
2012 Logbook entries - Preservation starts and we borrow tapes from Pierre-Alain Muller

Many thanks to

Erlo Haugen
Grady Booch
Grek Bek
Pierre-Alain Muller
Michael Druke
Pascal Leroy