Rational/R1000s400/Logbook



2026-01-03 New Year's code coverage testing

We spent some time over New Year's compiling and running the emulator with code coverage testing, mostly as an experiment, but also out of curiosity about whether the Rational Environment "used the entire bow", so to speak.

The emulator was first run through the udiag, and then booted the environment with send/expect scripting, up to the first snapshot and a subsequent login on the operator's console, a tad more than 2000 seconds of emulated time.

The top-line result is that 94.14% of the code lines in r1000_arch.c did get to do their job, leaving just 177 lines that never executed.

... approximately: The counts are not 100% precise, for instance this makes no sense:

  860   16         if (CMDS(CMD_IDL))
  861   16                 return (true);
  862
  863   16         assert(0);
  864    0 }

In addition to 32 asserts, there are a number of conditions marked "SPARE", and a few other undefined corner-case situations, for instance the unused fourth selection for multiplier output, which are not expected to execute, and which did not.

Of actual substance, manual inspection finds:

  • 4 unused fiu-conditions
  • No scavenger traps
  • 4 unused macro-events
  • 6 unused seq-conditions
  • seq random "Clear Stack" is unused.
  • one unused typ-condition
  • 5 unused ioc-conditions (Two related to ECC)
  • 4 unused ioc randoms (Two memory read&write, two front panel LED control)
  • 1 line checking the front panel key-switch position.
  • A few "AND" logical conditions where the last terms are not evaluated.

We may be able to pare that list down further, for instance if some of the unused randoms and conditions are never issued by the microcode.

But all in all, we now know that we can script a trivial test, which gets into almost all corners.

2025-12-22 A satisfying simplification

The "Resolve Lex Valid" register is a 16 bit register which knows which lex-levels are valid right now, and the sequencer microcode can update it in four different ways via the seq.random field:

   0x0 Invalidate ("clear") this lex level
   0x1 Validate ("set") this lex level
   0x4 Invalidate all lex levels above this
   0x7 Invalidate all lex levels

The hardware implements this on schematic page 17 of the Sequencer, with a slightly daunting bit of ingenuity:

Note the 74F182 Carry Look Ahead Adder Generator, named "VBCLA". Even though the circuit designer has helpfully documented the logical equations next to it, and we could sort of see where this was going, we had left the emulator to faithfully do what the hardware on page 17 did: Figure out the address bits, and use whatever the contents of the four PA041 PROM images came up with.

So it felt quite satisfying to finally replace that with source code which clearly shows what is actually going on:

  static void
  seq_nxt_lex_valid(void)
  {
 
          switch((r1k->seq_rndx >> 5) & 0x7) { // SEQ microarch pdf pg 33
          case 0:      // Clear Lex Level
                  r1k->seq_lex_valid &= ~(1 << (15 - r1k->seq_resolve_address));
                  break;
          case 1: // Set Lex Level
                  r1k->seq_lex_valid |= (1 << (15 - r1k->seq_resolve_address));
                  break;
          case 4: // Clear Greater Than Lex Level
                  r1k->seq_lex_valid &= ~(0xfffe << (15 - r1k->seq_resolve_address));
                  break;
          case 7: // Clear all Lex Levels
                  r1k->seq_lex_valid = 0;
                  break;
          default:
                  break;
          }
  }

The new code is also a fraction of a percent faster.

2025-11-27 Disassembling the R1000 Microcode

We have revisited the R1000 Microcode disassembler. Now that we can study what the emulator actually does, it becomes easier to figure out what the bits mean, and that helped us spot a missing "inverter" in our decoding of the microcode files, which caused the TYP+VAL frame number field to be off by one.

With that and other tweaks, the output is starting to make more sense, but we are very far from anything near the original source code, as can be seen in this example:

  2fec ; --------------------------------------------------------------------------------------
  2fec ; 0x0273        Execute Discrete,Plus
  2fec ; --------------------------------------------------------------------------------------
  2fec         MACRO_Execute_Discrete,Plus:
  2fec         PLUS_OP:
  2fec 2fec                                            ; -- TITLE DISCRETE_EXECUTE - PLUS_OP
                                   ; OPCODE (NAME        => "EXECUTE,DISCRETE,PLUS",
                                   ;         LABEL       => PLUS_OP,
                                   ;         NEEDS_VALID => 2,
                                   ;         NEEDS_FREE  => 0)
                                   ; CHECK_CLASS ([TOS-1] = [TOS] = OF_KIND.DISCRETE_VAR),
                                   ; TYP { [TOS-1] := LITERAL_DISCRETE },
                                   ; VAL { [TOS-1] := [TOS-1] + [TOS] },
                                   ; POP_CONTROL_STACK,
                                   ; IF NOT VAL(OVERFLOW) THEN USUALLY DISPATCH,
           dispatch_csa_valid      2 None
           dispatch_cur_class      8 None
           dispatch_ignore         1 None
           dispatch_uadr        2fec None
           fiu_mem_start           2 start-rd
           ioc_adrbs               3 seq
           seq_b_timing            3 Late Condition, Hint False
           seq_br_type             d Dispatch False
           seq_branch_adr       2fed 0x2fed
           seq_cond_sel           09 VAL.ALU_OVERFLOW(late)
           seq_random             04 ?
           typ_a_adr              1f TOP - 1
           typ_b_adr              10 TOP
           typ_c_adr              20 TOP - 0x1
           typ_c_mux_sel           0 ALU
           typ_csa_cntl            3 POP_CSA
           typ_frame               0 None
           typ_mar_cntl            e LOAD_MAR_CONTROL
           typ_rand                8 SPARE_0x08
           val_a_adr              1f TOP - 1
           val_alu_func            1 A_PLUS_B
           val_b_adr              10 TOP
           val_c_adr              20 TOP - 0x1
           val_c_mux_sel           2 ALU
           val_frame               0 None

The Knowledge Transfer Manual contains a couple of snippets from the original microcode source files on pdf pages 17 to 22, and we have decorated the snippet above with the relevant part.

We have added the microcode disassembler to the AutoArchaeologist, so we can start to study the eleven M200-M400 microcode files in the DFS image.

2025-10-20 A lot wiser on tape problem

That turned out to be spot on!

We had been a little bit too quick with the copy&paste and instead of merely signalling unit-attention on short tape reads, we failed the SCSI transaction with a timeout.

With that fixed, restore from backup tapes works, and the R1000 CPU emulation code is no longer treated as suspect.

2025-10-18 No wiser on tape problem

We have been running some of the older (and much slower) Emulator versions to try to learn more about the tape-restore issue, and we cannot really claim we have learned much: We keep getting the same "Unexpected Exception".

Along the way we have spotted one detail, which will be our focus going forward: The last block read from tape before the exception is the first and only partial tape read operation.

The R1000 asks for 0xc00 bytes, the tape block is either 0x400 or 0x800 bytes, depending on which of the two backup tapes we test with.

We have implemented the emulated SCSI tape device based on the ExaByte 8200 tape drive manual, but far be it from us to claim we got it right.

The fact that SCSI tape-drive block-size unexpectedness is signaled with a bit named "SILI" tells you everything you need to know about how well-designed that part of the SCSI standard was.
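For reference, this is how we understand the generic SCSI sequential-access rules for a short variable-length read: the drive transfers the bytes it has and, unless the initiator set SILI, reports an incorrect-length condition together with the difference between requested and actual length. The sketch below is only that understanding written out; the type and function names are made up for illustration, it is not the emulator's code, and it makes no claim about what the R1000 or the Exabyte 8200 actually expects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; the names are ours, not the emulator's. */
struct read_result {
	uint32_t transferred;	/* bytes actually handed to the initiator */
	uint32_t residual;	/* requested length minus actual block length */
	bool     ili;		/* report an incorrect-length condition? */
};

/*
 * Variable-length read where the tape block is no longer than the request.
 * ASSUMPTION: only the "underlength" case is modelled; overlength blocks
 * and fixed-block mode follow other rules and are left out.
 */
static struct read_result
tape_short_read(uint32_t requested, uint32_t block_len, bool sili)
{
	struct read_result r;

	assert(block_len <= requested);
	r.transferred = block_len;
	r.residual = requested - block_len;
	/* With SILI set, a short block is not reported at all. */
	r.ili = (r.residual != 0) && !sili;
	return (r);
}
```

With the numbers above, a 0xc00 byte request against a 0x400 byte block transfers 0x400 bytes and leaves a residual of 0x800.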

2025-08-06 Tapes are fine, emulator maybe not ?

Trying to restore Environment backup tapes is not going well:

  ====>> Recovery <<====
  Positioning tape to Backup Index
  Processing Backup Index
     Processing Tape File: Vol Info
     Processing Tape File: VP Info
     Processing Tape File: DB Backups
     Processing Tape File: DB Processors
     Processing Tape File: DB Disk Volumes
     Processing Tape File: DB Tape Volumes
  Positioning tape to Backup Data
  Processing Backup Data
     Processing Tape File: Space Info Vol 1
     Processing Tape File: Block Info Vol 1
     Processing Tape File: Block Data Vol 1
     Processing Tape File: Space Info Vol 2
     Processing Tape File: Block Info Vol 2
     Processing Tape File: Block Data Vol 2
  
  *** Unexpected Exception:  <Exception: Unit = 914164, Ord = 14>, from PC = #914008, #128D
       Detected by Routine:  Recover_Save_Set
  *** Recovery was not successful ***

As far as we can tell from tracing, the emulated tape drive does the right thing and this exception happens after the tape reading has completed.

We know that it is possible to restore this backup tape, because we have done so previously, on the real hardware, in order to check that we did in fact have a good backup in our BitArchive.

We created a new backup tape, using the emulator, based on the slash-and-burn "snap08" disk image, and we got the same error.

We have changed a number of variables, for instance the snap08 image is only 250MB, so we tried to restore that to a single-disk system: still the exact same error.

Since the hardware is stuck on a bad SRAM, next experiment is to launch release2 of the emulator on the "HW" schematics, and see what happens there.

First part of that was to update the schematics to KiCad9, but that was fortunately trivial.

2025-08-04 Now also emulating tapes

A couple of days have been spent getting the emulated SCSI tape to work better, and with the help of the Exabyte-8200 User's manual from bitsavers.org, we seem to have figured out how the SCSI 0x11 "SPACE" command should work.
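As a rough illustration of what decoding that command involves, here is a minimal sketch of pulling the fields out of a 6-byte SPACE CDB as the SCSI specifications lay it out: opcode 0x11, a code field in the low bits of byte 1, and a 24 bit two's complement count in bytes 2 to 4. The struct and function names are hypothetical and not taken from the emulator, and masking the code field to two bits is our assumption for a SCSI-1 era drive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only. */
enum space_code {
	SPACE_BLOCKS = 0,	/* skip over data blocks */
	SPACE_FILEMARKS = 1,	/* skip over filemarks */
};

struct space_request {
	unsigned code;		/* what to skip over */
	int32_t  count;		/* negative: toward beginning of tape */
};

static struct space_request
space_decode(const uint8_t cdb[6])
{
	struct space_request req;
	uint32_t raw;

	assert(cdb[0] == 0x11);
	/* ASSUMPTION: two-bit code field, as on older drives. */
	req.code = cdb[1] & 0x03;
	/* Count is big-endian, 24 bits, two's complement. */
	raw = ((uint32_t)cdb[2] << 16) | ((uint32_t)cdb[3] << 8) | cdb[4];
	if (raw & 0x800000)
		req.count = (int32_t)raw - 0x1000000;
	else
		req.count = (int32_t)raw;
	return (req);
}
```

The sign extension is the part that is easy to get wrong: a count of ff ff ff means "one backwards", not sixteen million forwards.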

At least it works well enough that the emulator is chugging along trying to read in the environment backup tape we created from the system on PAM's disks.

Our interest in the tape-drive is partly so we can explore the tapes PAM also donated, but primarily because we want to create a single-disk Environment to make the emulator a little bit less intimidating.

To do that we need to run the original disk image, delete enough stuff that it will fit on a single volume, write DFS and Environment backup tapes, and load them into an emulator which only has a single disk.

That is going to take some time.

But the emulator runs 30-40% faster than real hardware on 2% of the power, and the emulated SCSI disks and tapes are nearly instantaneous, so it will take not nearly as much time as on the real machine.

Even more importantly, we can write expect-send scripts for the emulator's console terminal, so we do not even have to be present while it chugs along.
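The "expect" half of such a script boils down to watching the console output for a string before sending the next line. A minimal sketch of that idea, assuming a byte-at-a-time feed from the console socket, could look like this; the struct and function names are hypothetical and this is not the scripting code we actually use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define EXPECT_MAX 64

/* Hypothetical helper, not part of the emulator. */
struct expecter {
	char   pattern[EXPECT_MAX];
	size_t plen;
	char   window[EXPECT_MAX];	/* most recent plen bytes of output */
	size_t wlen;
};

static void
expect_init(struct expecter *e, const char *pattern)
{
	size_t n = strlen(pattern);

	assert(n > 0 && n < EXPECT_MAX);
	memcpy(e->pattern, pattern, n);
	e->plen = n;
	e->wlen = 0;
}

/*
 * Feed one byte of console output.  Returns true when the pattern is
 * exactly the most recent thing the console printed.
 */
static bool
expect_feed(struct expecter *e, char c)
{
	if (e->wlen == e->plen) {
		/* Window is full: drop the oldest byte. */
		memmove(e->window, e->window + 1, e->plen - 1);
		e->wlen--;
	}
	e->window[e->wlen++] = c;
	return (e->wlen == e->plen &&
	    memcmp(e->window, e->pattern, e->plen) == 0);
}
```

A script is then just a list of (expect, send) pairs: feed bytes until expect_feed() returns true, write the send string, move on to the next pair.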

2025-07-27 Making sense of things using the emulator

Having the emulator means we can now spelunk the R1000 system from the comfort of home, and as time and energy permits, we are starting to make sense of some things now.

All complex systems have their own vocabulary: Normal people "boot" their computers, but IBM "IPL" (=Initial Program Load) them, and I am sure other computer manufacturers have used other words.

When we started spelunking the data structures of the R1000, we had no access to the original Rational vocabulary for the lower levels of the system, and so resorted to sticking likely-sounding words on the structures we dug out.

Being able to safely try things out in the emulator provides us hints and glimpses of the official vocabulary, for instance by poking the kernel CLI interface:

  *Kernel: show_space_info
  VPID [266]: 8
  KIND [MODULE]: INSTRUCTION
  SEGMENT [2460516]: 4289
  SNAPSHOT_NUMBER [63969]: 1
  THE_SPACE        => ( 8,INSTRUCTION, 4289)
  COMMIT_TIME      => 1
  FAMILY           => (UP => 7, PROC => 0, SEQ => 779)
  PERMANENT        => TRUE
  IMPERMANENT      => FALSE
  VER_CONTROL      => COMMIT_BY_REQUEST
  GENERATION       => 2
  COMMITTED        => TRUE
  PAGE_COUNT       => 1
  DELETED          => FALSE
  USER_DATA        => 1: unknown
  OBJECT           => Manager 0Instance 1

(The missing space between '0' and 'Instance' on the last line is a genuine R1000 kernel bug)

Here we learn that what we called a "segment" is actually a "space".

In retrospect, we should have caught a clue sooner, because the backup tape has a file for each disk, named "Space Info Vol X" in the ANSI tape labels.

Poking the kernel CLI further, we learn that they come in at least three fundamental types: "INSTRUCTION", "MODULE" and "IMPORT", and we can even locate the kernel code which produces that output in a code segment in the DFS filesystem.

And as luck would have it, a typing mistake revealed more useful information:

  *Kernel: show_space_info
  VPID [4]: 4
  KIND [IMPORT]: IMPORT
  SEGMENT [256302]: 4450606
  EXPECTED VALUES ARE:  0 ..  4194303

All computer geeks will recognize that number as 2^22-1, so the identity of a space consists of 10 bit VPID, 2 bit KIND and 22 bit SEGMENT.
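To make the arithmetic concrete, here is a minimal sketch of packing and unpacking such a 34 bit identity. The field widths come from the observations above; the order of the fields within the word, and all the names, are our assumptions for illustration, not something we have confirmed from the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define VPID_BITS	10
#define KIND_BITS	2
#define SEGMENT_BITS	22
#define SEGMENT_MAX	((1u << SEGMENT_BITS) - 1)	/* 4194303 */

/* Hypothetical representation of a space identity. */
struct space_id {
	unsigned vpid;
	unsigned kind;
	uint32_t segment;
};

/* ASSUMPTION: VPID in the top bits, then KIND, then SEGMENT. */
static uint64_t
space_pack(struct space_id id)
{
	assert(id.vpid < (1u << VPID_BITS));
	assert(id.kind < (1u << KIND_BITS));
	assert(id.segment <= SEGMENT_MAX);
	return (((uint64_t)id.vpid << (KIND_BITS + SEGMENT_BITS)) |
	    ((uint64_t)id.kind << SEGMENT_BITS) | id.segment);
}

static struct space_id
space_unpack(uint64_t word)
{
	struct space_id id;

	id.segment = (uint32_t)(word & SEGMENT_MAX);
	id.kind = (unsigned)((word >> SEGMENT_BITS) &
	    ((1u << KIND_BITS) - 1));
	id.vpid = (unsigned)((word >> (KIND_BITS + SEGMENT_BITS)) &
	    ((1u << VPID_BITS) - 1));
	return (id);
}

/* The range check the kernel CLI applied to our typo. */
static int
space_segment_valid(uint32_t segment)
{
	return (segment <= SEGMENT_MAX);
}
```

The mistyped 4450606 fails space_segment_valid(), while the 4289 from the first example passes.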

We also spotted a number of different values in the USER_DATA field, and were able to locate those strings to a kernel code segment in the DFS filesystem, and from that derive the complete list.

Here it transpired that the 0xe3 value which we had assumed was "ada source code" is actually "Image: Permanent editor buffers", which explains why there are so (relatively) few of them.

But we also run into new mysteries: The "Space Info" files on the backup tape do not contain the actual snapshot number, called COMMIT_TIME by the kernel CLI, but they have a number which somehow translates to the snapshot number, at least for some of the spaces.

2025-05-28 And now for something entirely different

...An R1000 instruction simulator.

The disk image contains a code segment which the AutoArchaeologist names ⟦2fa0095f7⟧.

It is blatantly obvious that this object contains a disassembler for the R1000 instruction set.

When Allan made his amazing reverse engineering of the un(der)documented R1000 instruction set, this was one of the segments he exploited.

It is not a very big segment, and it uses a quite small subset of the R1000 instruction set, so instead of staring through the train window while long stretches of Scandinavia were getting rained on, we tried to cobble together a very crude python program which could execute those instructions.

It took a lot of guesswork, and a few ugly hacks, but it was possible and took less than 850 lines of python.

We do not trust all aspects of the output, but it did reveal a handful of typos and similar trivial mistakes in our current instruction list.

Between this new source, and the INSTRUCTIONS_SPEC file we found in the EEDB file-system, we now have two very credible sources which agree about almost all instructions, and we have started to revise the disassembler in the PyReveng3 project accordingly.

The discrepancies between the two sources show how the instruction set continued to develop in lock-step with the Ada-compiler's code generator, and we still cannot rule out that the final version, as represented by the disk-images, uses instructions our disassembler still has not learned about.

2025-05-17 Decoding the "pure" Rosetta Stones

Spring is a busy time, and not much is happening with the HW and the emulator.

But we have been spelunking the "archive_on_shutdown" disk-images we created a long time ago, and this has been very productive.

Permanent objects in the R1000 system are controlled by "Manager daemons" and these store their internal state as objects on the disk.

A special facility controlled by Operator.Archive_On_Shutdown(bool) causes the manager daemons to save their state in a serialized form where many items are in string form. (See Guru Course 01 pg. 138)

On next boot, this string form of the state is read and saved into the regular form, which might now be different if a software update was applied.

The serialized state appears as '*.pure' files in the EEDB filesystem, which are quite a bit easier to figure out than the regular state segments (named '*.state').

At the end of the '*.pure' segments, after the actual state, there is a "schema" which we found out describes the regular '*.state' file layout, and while we have not nailed down all details of either of the formats, we have gotten a lot further.

We have now managed to manually trace our way from a filename, to a Directory object, to a File object and from there to the segmented heap which contained the Rational copyright message.

Next task is to make the AutoArchaeologist present all this new information sensibly.

2025-03-11 Press ESC-ESC-O-H for Enclosing-in-place

We tried booting the original disk-images, and repeated the experiment from last week, but without a Facit Twist it is a bit cumbersome, since a modern machine doesn't have nearly as many keys on the keyboard, and in particular not laptops.

So we have created a cheat-sheet and type the necessary escape-sequences by hand, ESC-ESC-O-H for Enclosing-in-place, ESC-[-K for WINDOW and so on.
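In code form, such a cheat-sheet is little more than a lookup table from key name to byte sequence. A minimal sketch, containing only the two sequences mentioned above and using made-up struct and function names, might be:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table layout; only the two entries named in the text. */
struct key_seq {
	const char *name;	/* key legend on the terminal */
	const char *bytes;	/* what to type or send instead */
};

static const struct key_seq cheat_sheet[] = {
	{ "Enclosing-in-place",	"\x1b\x1bOH" },	/* ESC ESC O H */
	{ "WINDOW",		"\x1b[K" },	/* ESC [ K */
};

/* Returns the escape sequence for a key name, or NULL if unknown. */
static const char *
key_lookup(const char *name)
{
	size_t n = sizeof(cheat_sheet) / sizeof(cheat_sheet[0]);

	assert(name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(cheat_sheet[i].name, name) == 0)
			return (cheat_sheet[i].bytes);
	return (NULL);
}
```

The same table could later feed a scripting tool, so a script can say WINDOW rather than spell out the bytes.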

The short and the long is that we got the same constraint error, until we noticed that the files we were trying to edit were listed with an "A" which means "Archived".

Once we opened both the 'SPEC and 'BODY for editing, we could promote them both to installed with no problem.

So maybe that Constraint Error is simply all the Environment has to say to somebody who tries to do something utterly crazy ?

We'll find out next time we get the hardware running.

But we have been thinking, and this made us think about it again, that it would be neat to have a way to script scenarios and experiments in the emulator.

There are libraries, for instance libteken which can interpret the escape sequences the r1000 sends and render the screen image correctly. And sending the proper escape sequences is obviously not a problem either.

The hard part is how a "send-expect" style scripting language can work, and if the scripts end up being write-only or if one can actually figure out what they do by reading them.

For trivial text-screen-oriented applications, say the vi(1) or EMACS editors, it is a tractable problem, in the sense that their output is 100% predictable based on the key-presses they receive.

But the r1000 starts snapshots asynchronously, and it chooses which of the three window areas to use, with a predictable but quite complex algorithm based on the contents of the window stack.

It would be neat to have, but probably not to implement.

2025-03-06 So does it actually work ?

The wisdom of »First make it work, then make it faster« is not up for debate.

Yet in this project we had to do the precise opposite, because, as John Aynsley from Doulos warned us, before we even embarked on the SystemC emulation, we saw "kiloHertz not megaHertz".

Yesterday we hooked the Facit Twist in the museum to the emulator back home, through:

  • A 9600 serial line
  • A USB-serial adapter
  • A Raspberry Pi 3
  • A ssh connection across
  • …A wired ethernet
  • …A wireless access point
  • To a laptop with a flaky wireless connection
  • Another ssh connection
  • …Through a TINC VPN
  • …Across the same flaky wireless connection
  • …Through more layers of firewalls than reasonable
  • …100km through Danish ISPs
  • …To another firewall where the TINC VPN terminated
  • …Through a few meters of CAT7
  • To another machine
  • A Telnet session to the tcp socket on the machine running the emulator.

This was not optimal from a response-time point of view.

Instead we launched the emulator on our "exhibition server", a decade old Intel Sandy Bridge we are going to use to run "the remote end" of our "modem-age" exhibit: BBS, ISP, dial-in service etc.

The server, doing nothing else, managed to run the emulator at a third of hardware speed, which was still usable.

Hoping to eventually speed up booting more, by not wasting time on the nonexistent ENP-100 network co-processor, we tried replacing the entire substantive contents of the !MACHINE.INITIALIZE_DTIA'BODY program with null;, but when we hit Promote we got:

 Promote failed - Unhandled exception: Constraint_Error (Variant), from PC = #46300E, #1435

We tried adding a null; to the one already in !MACHINE.INITIALIZE_MAIL'BODY, and got the exact same result when promoting.

We tried Operator.Enable_Privileges

Same result.

To be honest: We have no idea if that was even supposed to work, but we would at least expect a more "polished" error message, so we assume that something does not work as it should.

"Something" is almost certainly the emulator.

But just in case this is fall-out from the Genghis Khan-style "cleanup" we performed on this disk-image, we will repeat the experiment on the original disk-image.

It would have been nice to just fire up the real machine on the same disk-image and see what should have happened, but we need to catch up with the SRAM epidemic first.

2025-02-25 Well that escalated fast…

We now have a version of the R1000 emulator without SystemC and it runs at 73% of hardware speed, and boots to idle in 37 minutes.

We totally did not expect that two days ago.

The R1000 part of the emulation runs so fast that the simulated IOP often panics. It looks like an interrupt priority/simultaneous interrupts problem, and we're testing fixes for that.

2025-02-23 Ready to ditch SystemC

The last few weeks we have turned the SystemC "signals" into C/C++ variables, and reduced the "sensitivity" of the "components" to just their clock signals:

This has made the emulator much faster: Booting to "idle" in 100 minutes is only five times slower than the real machine, and quite usable.

Better yet: We can get rid of SystemC now: We have a C++ class for each board, and we can just call their "clock" methods in sequence and we should be good.
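The shape of that idea, stripped to the bone, is a list of boards whose clock routines are called one after the other for every emulated clock cycle. The sketch below uses plain C structs with a function pointer as a stand-in for the C++ classes; the names and the trivial tick routine are made up for illustration and do not reflect the actual emulator source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one board's C++ class. */
struct board {
	const char *name;
	uint64_t    cycles;			/* clock edges seen so far */
	void      (*clock)(struct board *self);	/* the "clock" method */
};

/* Placeholder: a real board would update its registers here. */
static void
board_tick(struct board *self)
{
	self->cycles++;
}

/* One emulated clock cycle: call every board in a fixed order. */
static void
machine_step(struct board *boards, size_t nboards)
{
	for (size_t i = 0; i < nboards; i++) {
		assert(boards[i].clock != NULL);
		boards[i].clock(&boards[i]);
	}
}
```

No event queue and no signal propagation: the fixed calling order has to encode whatever ordering between boards the hardware depended on, which is the price for dropping the simulation kernel.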

Until now SystemC emulation of signals and propagation delays has totally dominated performance, but these last couple of weeks the C/C++ code started to matter too.

Once SystemC is gone, it all comes down to the C/C++ code, and making that run five times faster may be a tall order.


2023…2012

  • 2014-2018 - The project got stuck for lack of a sufficiently beefy 5V power-supply, and then phk disappeared while he built a new house.

Many thanks to

  • Erlo Haugen
  • Grady Booch
  • Grek Bek
  • Pierre-Alain Muller
  • Michael Druke
  • Pascal Leroy