- Do we need to format the disks to 1024-byte sectors before we mount them in the R1000?
- Greg Bek: You should be able to boot off tape and initialize the disk, which will format it correctly. If it is a disk that came from Rational at some point, it is probably already sectored correctly.
- What SCSI disks can we use, how many disks do we need, and how do we get the R1000 to accept them?
- Greg Bek: One is all you need. The first disk is partitioned into two parts: 1) the DFS (Diagnostic File System), about 300 MB, and 2) the Environment, which takes the rest.
- Greg Bek: Due to Y2K issues and software licensing issues, you may need to set the clock back to before 2000.
- Nico de Jong: I recognize the problems you had reading 8200 tapes on an 8505. I have had the same experience. For some weird reason, some 8200 tapes can only be read on an 8505 when the tape to be read is write-protected! It is some time since I had to do that, but I seem to remember that it was especially a problem with Fuji tapes.
The Intermedia conversion system in the museum has the ability to dump tapes, including the recording of block sizes etc.
After further tests we finally took the leap, grabbed the cutter and soldering iron, and replaced the three suspected memory chips. The boot sequence with the original BIOS now gives:
    R1000-400 IOC SELFTEST 1.3.2
    512 KB memory ... [OK]
    Memory parity ... [OK]
    I/O bus control ... [OK]
    I/O bus map ... [OK]
    I/O bus map parity ... [OK]
    I/O bus transactions ... [OK]
    PIT ... [OK]
    Modem DUART channel ... [OK]
    Diagnostic DUART channel ... [OK]
    Clock / Calendar ... [OK]
    Checking for RESHA board -- Bench mode (ID 7) detected
    Skipping RESHA tests
    Local interrupts ... [OK]
    Illegal reference protection ... [OK]
    I/O bus parity ... [OK]
    I/O bus spurious interrupts ... [OK]
    Temperature sensors ... [OK]
    IOC diagnostic processor ... [OK]
    Power margining ... [OK]
    Clock margining ... [OK]
    Selftest passed
    Restarting R1000-400S January 1st, 1901 at 00:03:56
    Logical tape drive 0 is an 8mm cartridge tape drive.
    Logical tape drive 1 is declared non-existent.
    Logical tape drive 2 is declared non-existent.
    Logical tape drive 3 is declared non-existent.
    Booting I/O Processor with Bootstrap version 0.4
    Boot from (Tn or Dn) [D0] :
Next step: Mount the board into its rightful place and see how far it gets now...
Decided to make further measurements with the oscilloscope in order to rule out other causes. Probing all pins on a good and a bad RAM chip did not reveal anything.
Tried to piggyback H11 with a new RAM chip, and the questionable bit settled midway between VCC and GND. It could be that the good chip tried to pull down while the bad chip pulled up. Double-checked chip-select on H3, the only other identified chip that may drive the same data line (D30), and its chip-select is completely passive during the troublesome period.
Previous tests went between [0x00000000..0x00040000[ and [0x00040000..0x00080000[. Now tried to run the tests between [0x00001000..0x00021000[ and [0x00041000..0x00061000[ as well as between [0x00001200..0x00021200[ and [0x00041200..0x00061200[ to test whether run-length could be a factor. The exact same failing addresses indicate run-length is not a factor.
Tried a reverse scan: Set(addr), Clr(addr), Set(addr), then Get(addr-4). The same data bits are affected in both banks, but the failing addresses are not identical. Some patterns are similar, and some other patterns appear.
Tried another test with: Set(addr), Clr(addr), Set(addr), Get(addr+4), branch-test delay, then another Get(addr+4) - The second Get reads out correctly. This indicates that the chip does hold the right value, but that it is read out incorrectly in some circumstances.
Previous software tests indicated problems with certain bits at some address patterns: Bits 7 and 23 in the low bank and bit 30 in the high bank showed issues. The fault manifests itself at some addresses when the following memory accesses are done in quick succession: Set(addr), Clr(addr), Set(addr) then Get(addr+4) - The Get(addr+4) returns incorrect values only on these bits.
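The access pattern described here can be sketched in Python against a simulated memory. This is only an illustration: the real test ran as 68k code on the IOC, `ToyRam` and `scan` are hypothetical names, Set and Clr are assumed to write all-ones and all-zeroes, and the fault model (a read of the word right after a write to the previous word returns flipped bits at chosen addresses) is made up to keep the example runnable, not a description of how the actual chips misbehave.

```python
WORD = 4  # 32-bit words, so the neighbouring word is at addr + 4


class ToyRam:
    """Hypothetical RAM model with an optional read-after-write fault."""

    def __init__(self, bad_bits=0, bad_addrs=()):
        self.cells = {}
        self.bad_bits = bad_bits          # bit mask that gets flipped
        self.bad_addrs = set(bad_addrs)   # addresses where the fault shows
        self.last_written = None

    def write(self, addr, value):
        self.cells[addr] = value & 0xFFFFFFFF
        self.last_written = addr

    def read(self, addr):
        value = self.cells.get(addr, 0)
        # Toy fault: only the first read directly after a write to the
        # previous word is disturbed; a later re-read returns the true value.
        if (self.last_written is not None
                and addr == self.last_written + WORD
                and addr in self.bad_addrs):
            value ^= self.bad_bits
        self.last_written = None
        return value


def scan(ram, start, end, pattern=0xFFFFFFFF):
    """Run Set/Clr/Set/Get(addr+4) over [start, end) and return
    {failing_addr: wrong_bits}."""
    for addr in range(start, end, WORD):      # pre-fill with a known value
        ram.write(addr, pattern)
    failures = {}
    for addr in range(start, end - WORD, WORD):
        ram.write(addr, pattern)              # Set(addr)
        ram.write(addr, 0)                    # Clr(addr)
        ram.write(addr, pattern)              # Set(addr)
        got = ram.read(addr + WORD)           # Get(addr+4)
        if got != pattern:
            failures[addr + WORD] = got ^ pattern
    return failures
```

Calling `scan(ToyRam(bad_bits=(1 << 7) | (1 << 23), bad_addrs={0x104}), 0x100, 0x110)` reports a single failing word whose wrong-bit mask has bits 7 and 23 set, which is the kind of per-bit summary that let the software results be matched against individual chips.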
Today all RAM chips were checked with the oscilloscope to verify and possibly pinpoint the problem. H11, G10 and G41 showed different behavior on the oscilloscope, and these chips happen to map to the exact bits identified by the software test.
H40 did show a little flickering on the DC levels, but the signal edges seemed OK.
The above input to the RAM-chips looks like this:
The output of a healthy chip looks like this:
The output of the sick chips looks like this:
Next step will be to replace them. We have replacements ready, so stay tuned...
Tried to patch the EEPROM in various ways, and learned a lot more.
If we skip the offending memory check (and the EEPROM checksum, because we're lazy), we get all the way to the boot device prompt (tape/disk).
We got a two-way serial connection to the console port: TX is TTL level, RX is RS-232 level.
In trivial homebrew tests, the RAM does not fail, but what we call "ramtest_5" repeatedly does.
Big discovery of the night: The two top address bits of the EEPROM are swapped on the PCB, so the middle two quarters are swapped in the image we try to reverse-engineer. After fixing that, the contents make a lot more sense.
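Undoing such a swap on a dumped image is a small transformation. The sketch below assumes a power-of-two image size (32 KB for a 28256) and that exactly the two highest address lines are exchanged; both function names are made up for illustration. It shows the bit-level view and the equivalent "swap the middle two quarters" view side by side.

```python
def swap_top_address_bits(image: bytes) -> bytes:
    """Return a copy of image with the two highest address bits exchanged."""
    n = len(image)
    if n < 4 or n & (n - 1):
        raise ValueError("image length must be a power of two, at least 4")
    top = n.bit_length() - 2              # index of the highest address bit
    mask = (1 << top) | (1 << (top - 1))
    out = bytearray(n)
    for addr in range(n):
        hi = (addr >> top) & 1
        lo = (addr >> (top - 1)) & 1
        # Swapping two bits only changes the address when they differ.
        src = addr ^ mask if hi != lo else addr
        out[addr] = image[src]
    return bytes(out)


def swap_middle_quarters(image: bytes) -> bytes:
    """Same result, expressed as exchanging the 2nd and 3rd quarters."""
    q = len(image) // 4
    return image[:q] + image[2 * q:3 * q] + image[q:2 * q] + image[3 * q:]
```

Both functions are their own inverse, so the same call converts between the order seen by the CPU and the order read out by an EEPROM programmer.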
Managed to power up the IOC board stand-alone. 5V @ 35A required (30A didn't seem to be quite enough).
- 5V @ 35A connected to 3 Capacitors at edge (to distribute load).
- Reset (GB113) to Ground (page 23 of R1000_SCHEM_IOC.pdf).
- CTS# (GB055) to 5V (page 25)
- Power On
- Release Reset (GB113)
TTL Serial output read from CPDRV0 @N1 (pin 2 or 3).
As expected output is still:
    R1000-400 IOC SELFTEST 1.3.2
    512 KB memory ... * * * * * * * FAILED
EEPROM 28256 is not compatible with EPROM 27256!
More email exchanges with Greg Bek and Michael Druke, both ex-Rational people.
Our IOC board does actually have 72 64K*1 DRAM chips, four banks of 18 chips.
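A quick sanity check of that chip count against the 512 KB the selftest reports. The raw arithmetic is certain; the interpretation of the surplus as one parity bit per byte (and hence 16 data chips plus 2 parity chips per bank) is an assumption, suggested only by the selftest having a separate memory parity step.

```python
CHIPS = 72
BITS_PER_CHIP = 64 * 1024      # 64K x 1 DRAM
BANKS = 4
REPORTED_KIB = 512             # what the selftest prints

raw_kib = CHIPS * BITS_PER_CHIP // 8 // 1024
extra_kib = raw_kib - REPORTED_KIB
chips_per_bank = CHIPS // BANKS

# Assumption: one parity bit per 8 data bits, i.e. 8 of every 9 chips hold data.
data_chips_per_bank = chips_per_bank * 8 // 9
parity_chips_per_bank = chips_per_bank - data_chips_per_bank

print(f"raw capacity       : {raw_kib} KiB")
print(f"beyond reported    : {extra_kib} KiB")
print(f"per bank (assumed) : {data_chips_per_bank} data + "
      f"{parity_chips_per_bank} parity chips")
```

The raw capacity comes out at 576 KiB, which is 64 KiB more than reported, exactly one eighth of 512 KiB.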
Read the EEPROM from the IOC board, and have started disassembling the self-test code, to find out what it actually does.
Powered up with only IOC and RESHA boards seated, same result, which at least means that we can test in that config.
Also tried powering the IOC board up out of the cage, but our +5V supply can only deliver 7 A, and that's not enough.
Greg Bek replied to email and told me I was looking at the wrong 68k processor: The daughterboard on the RESHA is just for TCP/IP.
The failing RAM test runs on the 68k on the IOC board.
Opened the card cage, where this label greets you:
Visual inspection of the IOC revealed nothing of note.
Changed the Lithium battery which is used for the RTC.
Gently reseated all socketed chips on the IOC.
Gently reseated all the other cards in the card cage.
Still fails RAM test.
The IOC Schematic manual mentions a low-level debugger, but it seems to require RAM test to pass before you can enter it.
ID-information on the IOC:
Our workshop area has been rearranged, the GIER computer moved and I could finally uncrate the R1000s400.
After a visual inspection, I did "the smoke-test" and can happily report that the smoke stayed inside.
Unfortunately, we don't get very far in the boot sequence:
    R1000-400 IOC SELFTEST 1.3.2
    512 KB memory ... * * * * * * * FAILED
This is 512KB DRAM memory on a small daughterboard on a CMC ENP100 VME board mounted on the RESHA board:
I have tried various obvious remedies, reseating connectors etc, but to no avail.
A few small details have to fall into place before I can start to power up the machine: Our new workshop area is done now, and the last missing bit is rearranging the power circuits.
In the meantime scanning of documentation progresses about a binder per week or so, and should be complete before summer.
Scanned in the "Guru Course" material while hosting COMAL kickoff meeting. Sheet-feeding scanners are a good thing.
I have started a new wiki page for the documentation: Rational/R1000s400/Documentation
The 8200 Ole brought was from an IBM RS/6000, and either because of special firmware/settings or a fault in that drive, it refused to do anything more advanced than a SCSI Inquiry and Test-Unit-Ready.
In the end I wrote a set of tapes on the known-suspect 8200 drive, and test-read them on the known-good 8505 drive with success.
I have now, finally, convinced myself that I have written good copies of Pierre-Alain's tapes, and more importantly, that I can do so again if the Exabyte drive in the R1000 cannot read them for reasons of alignment etc.
Received email from Greg Bek with answers to a lot of my questions, so the next thing on the program is to find a corner with power and space to set up the machine. This probably involves moving other stuff out of the way, so it may take some weeks.
Read Pierre-Alain's original tapes on the EXB-8505 drive, and got the exact same result as the read on the EXB-8200 drive, so now the originals can go back to Pierre-Alain.
Failed utterly to make the 8200 read a tape written on the 8505.
Ole has promised to bring in a known good 8200 drive, that will be next attempt.
Spent tonight trying to write copies of the tapes I read last Thursday.
The Exabyte 8200 drive I have used has clearly shown itself to be faulty, or at the very least flaky.
Tried using an 8505 drive instead, but worry that it may have used compression while writing the tapes.
1. Re-read Pierre-Alain's tapes with the 8505 to check that the on-disk copy is a good read. I have no reason to doubt this, based on my analysis, but I want to be 100% sure before I return the tapes to Pierre-Alain.
2. Try to read the tapes I wrote tonight with the flaky 8200 drive. If it can read (some of) them, they are not compressed.
Tonight I set up a FreeBSD computer with an Exabyte 8200 tape drive, and used it to read the three tapes Pierre-Alain Muller mailed me last month.
The tapes have a rather complex block structure, and I have read them in using a program which also stores the information about block-sizes, tape-marks etc, so that it should be possible to write exact copies of the tapes to install from.
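To illustrate what "storing block sizes and tape marks" can look like, here is a minimal sketch of a length-framed container, loosely modelled on the SIMH .tap layout (little-endian 32-bit length before and after each record, a zero length word for a tape mark). This is an assumption chosen for illustration; the format actually used by the program that produced the files is not described here, and `encode` and `decode` are hypothetical names.

```python
import struct

TAPE_MARK = None  # a tape mark is represented as None in the record list


def encode(records):
    """Pack a list of tape records (bytes) and tape marks (None)."""
    out = bytearray()
    for rec in records:
        if rec is TAPE_MARK:
            out += struct.pack("<I", 0)
            continue
        if not rec:
            raise ValueError("zero-length records cannot be represented")
        header = struct.pack("<I", len(rec))
        pad = b"\0" if len(rec) % 2 else b""   # keep records word-aligned
        out += header + rec + pad + header
    return bytes(out)


def decode(blob):
    """Inverse of encode(): recover records and tape marks in order."""
    records = []
    pos = 0
    while pos < len(blob):
        (length,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        if length == 0:
            records.append(TAPE_MARK)
            continue
        records.append(blob[pos:pos + length])
        pos += length + (length % 2) + 4       # data, padding, trailing length
    return records
```

Because every record keeps its own length, a tape with mixed block sizes and embedded tape marks survives a round trip through a flat file, which is what makes writing an exact copy back to tape possible.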
The result is three files:
    -rw-r--r-- 1 phk wheel 41837637 Aug 2 18:19 20120802_R1000_CATALOG
    -rw-r--r-- 1 phk wheel 13141316 Aug 2 18:20 20120802_R1000_DFS
    -rw-r--r-- 1 phk wheel 78362000 Aug 2 18:23 20120802_R1000_ENVIRONMENT

    MD5 (20120802_R1000_CATALOG) = 6102fb4c10fa580af4fb0e508cfd4127
    MD5 (20120802_R1000_DFS) = 5e61879f066484cacdd1e895cea65b41
    MD5 (20120802_R1000_ENVIRONMENT) = dce495e2b575e8aeae3da79e9fce8074
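Anyone holding copies of these three files can check them against the checksums listed above. A minimal sketch using Python's standard `hashlib`; the file names and digests are the ones given here, while `md5_of` and `verify` are just illustrative helper names and the files are assumed to sit in one directory.

```python
import hashlib
import os

EXPECTED = {
    "20120802_R1000_CATALOG": "6102fb4c10fa580af4fb0e508cfd4127",
    "20120802_R1000_DFS": "5e61879f066484cacdd1e895cea65b41",
    "20120802_R1000_ENVIRONMENT": "dce495e2b575e8aeae3da79e9fce8074",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each file name to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, wanted in EXPECTED.items():
        path = os.path.join(directory, name)
        try:
            results[name] = md5_of(path) == wanted
        except FileNotFoundError:
            results[name] = None
    return results


if __name__ == "__main__":
    for name, ok in verify().items():
        status = {True: "OK", False: "MISMATCH", None: "missing"}[ok]
        print(f"{name}: {status}")
```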
Many thanks to
- Erlo Haugen
- Grady Booch
- Greg Bek
- Pierre-Alain Muller
- Pascal Leroy