















CHAPTER
7
































   DOCUMENT III
   TECHNICAL PROPOSAL

   Apr. 29, 1982

         LIST OF CONTENTS

7.       RMA

7.1      Introduction
7.2      RMA Analysis
7.3      Reliability Models and Block Diagrams
7.3.1    Reliability Models for PU's

7.4      Availability of a Single Node
7.5      Availability Access to destination
7.5.1    Availability Access to destination including alternative
         routing
7.5.2    Availability End User to VIA Host

7.6      EMH availability
7.7      NMH availability

7        R̲E̲L̲I̲A̲B̲I̲L̲I̲T̲Y̲,̲M̲A̲I̲N̲T̲A̲I̲N̲A̲B̲I̲L̲I̲T̲Y̲ ̲A̲N̲D̲ ̲A̲V̲A̲I̲L̲A̲B̲I̲L̲I̲T̲Y̲ ̲A̲N̲A̲L̲Y̲S̲I̲S̲


         This chapter provides the detailed analysis of the
         reliability and maintainability provided by the proposed
         equipment. Emphasis has been given to covering the full
         range of the proposed system architecture. Furthermore,
         detailed information on failure rates and repair times
         is provided for the various components and modules
         included in the architecture.

7.1      I̲N̲T̲R̲O̲D̲U̲C̲T̲I̲O̲N̲

         The availability of the proposed equipment is very high,
         due not only to the high reliability of the individual
         system elements but also to the chosen CR80 computer
         configuration, in which functionally alike elements
         automatically substitute for each other in case of failure.
         Overall system availability has been calculated.

         The high system availability has been achieved by use
         of highly reliable modules, redundant processor units
         and line termination units, and automatic reconfiguration
         facilities. Care has been taken to ensure that single
         point errors do not cause total system failure.

         The reliability criteria imposed on the computer systems
         have been evaluated, and the proposed hardware/software
         operational system has been analysed to determine the
         degree of availability and data integrity provided. In
         this chapter reliability is stated in numerical terms and
         the detailed predictions derived from mathematical
         models are presented.

         The availability predictions are made in accordance
         with system reliability models and block diagrams corresponding
         to the proposed configuration. This procedure uses
         module level and processor unit level failure rates,
         or MTBF (mean time between failures), and repair times,
         or MTTR (mean time to repair). These factors are used
         in conjunction with a realistic model of the configuration
         to arrive at system level MTBF and availability.

         Tabulated results of the analysis are presented, including
         the reliability factors: system MTBF and repair time
         MTTR.

         The basic elements of the proposed system architecture
         are standard CR80 units. Reliability and maintainability
         engineering was a significant factor in guiding the
         development of the CR80.

         The CR80 architecture is designed to achieve a highly
         reliable computer system in a cost-effective way. It
         provides a reliable set of services to the users of
         the system, because it may be customised to the actual
         availability requirements. The CR80 fault tolerant
         computers are designed to avoid single point errors
         in all critical system elements by provision of redundant
         paths, processor capabilities and power supplies.

         The architecture reflects the fact that the reliability
         of peripheral devices is lower than that of the associated
         CR80 device controllers. This applies equally well
         to communication lines where modems are used as part
         of the transmission media. Thus, the peripheral devices,
         modems, communication lines, etc., impact the system
         availability much more than the corresponding device
         controllers.

         To assure this very high reliability, several
         criteria were also introduced on the module level:

             Extensive use of hi-rel, mil-spec components.
             ICs are tested to the requirements of MIL-STD 883
             level B or similar.

             All hardware is designed in accordance with the
             general CR80 H/W design principles. These include
             derating specifications, which greatly enhance
             the reliability and reduce the sensitivity to parameter
             variations.

             Critical modules feature a Built-In Test (BIT) capability
             as well as a display of the main states of the
             internal process by Light Emitting Diodes on the
             module front plate. This greatly improves module
             maintainability, as it provides debug and trouble
             shooting methods, which reduce the repair time.

             A high quality production line, which includes
             high quality soldering, inspection, burn-in and
             an extensive automatic functional test.

7.2       R̲M̲A̲ ̲A̲n̲a̲l̲y̲s̲i̲s̲

          This section provides information with respect to RMA
          analysis of a system.  It includes the detailed formulas
          which apply as part of the RMA calculations.

          The RMA analysis of a system provides information on
          how much of the time the system provides a given set
          of required functional capabilities, i.e. provides
          operative availability.  It shows how many times the
          system is not operative during a given period and for
          how long.  A system may be operative even with one
          or more elements of the total system down or taken
          off-line for repair and/or replacement of defective
          modules/units.  Note that this is operative as seen
          by a user of the functional capabilities, not as seen
          by maintenance personnel.
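
          As a rough illustration of "how many times" and "for how
          long", the sketch below converts an MTBF and MTTR pair into
          an expected number of outages and a total downtime over a
          period. The function name, the one-year period of 8760 hours
          and the steady-state failure/repair cycle are assumptions made
          for illustration and are not taken from the proposal; the
          input figures are those quoted for the Network Management
          Host in section 7.7.

```python
HOURS_PER_YEAR = 8760.0  # assumption: non-leap year, continuous operation


def outage_profile(mtbf_h, mttr_h, period_h=HOURS_PER_YEAR):
    """Return (expected outages, expected downtime in hours) for a period.

    Assumes the element alternates between an average up-time of MTBF
    and an average repair time of MTTR (steady-state approximation).
    """
    cycle_h = mtbf_h + mttr_h        # one average failure/repair cycle
    outages = period_h / cycle_h     # cycles, hence failures, per period
    downtime_h = outages * mttr_h    # each outage lasts MTTR on average
    return outages, downtime_h


# Figures quoted for the Network Management Host: MTBF 1129 h, MTTR 30 min.
outages, downtime_h = outage_profile(1129.0, 0.5)
print(f"{outages:.1f} outages/year, {downtime_h:.1f} h downtime/year")
```

          Under these assumptions the host would be down roughly eight
          times a year for a total of about four hours.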

          The basis for determining the system level availability
          is an RMA model of serial and parallel system elements.
          Each of these elements defines a specific subset of
          the total system with a well defined state, either
          functioning or not.  Serial elements are elements all
          of which have to be available for the set to be available.

          Parallel elements are sets where not all elements need
          to be available; the number required is determined by
          the required service level or the redundancy provided.

          The subsequent sections introduce the basic RMA building
          blocks.

7.2.1     S̲e̲r̲i̲e̲s̲ ̲E̲l̲e̲m̲e̲n̲t̲

          The mean time between failures of a series of n different
          RMA elements is made up as follows:

          $$ MTBF_S = \frac{10^6}{\lambda_S} $$

          where the series failure rate $\lambda_S$ (LAMBDA, in
          failures per 10^6 hours) is determined by the sum of
          the failure rates of the elements:

          $$ \lambda_S = \lambda_1 + \lambda_2 + \dots + \lambda_i + \dots + \lambda_n $$

          where $\lambda_i$ denotes the failure rate of the i'th
          element.

          The availability of a system of n different serial
          RMA elements is determined by:

          $$ A = A_1 \cdot A_2 \cdots A_i \cdots A_n $$
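
          The series rules can be sketched in a few lines of Python.
          The function names are hypothetical, and the input values
          in the example are illustrative only; failure rates are
          taken to be in failures per 10^6 hours, as implied by the
          MTBF formula.

```python
def series_failure_rate(rates):
    """Series failure rate: the sum of the element failure rates."""
    return sum(rates)


def mtbf_hours(failure_rate_per_1e6_h):
    """MTBF in hours for a failure rate given in failures per 10^6 hours."""
    return 1e6 / failure_rate_per_1e6_h


def series_availability(availabilities):
    """Availability of serial elements: the product of the element values."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


# Illustrative input only: two serial elements with rates 54.4 and 250.
rate = series_failure_rate([54.4, 250.0])
print(f"LAMBDA={rate:.1f}, MTBF={mtbf_hours(rate):.0f} h")
print(f"A={series_availability([0.9999, 0.9998]):.8f}")
```

          Because the rates add, the least reliable element dominates
          the series MTBF; here the element with rate 250 contributes
          most of the total.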

7.2.2    P̲a̲r̲a̲l̲l̲e̲l̲ ̲E̲l̲e̲m̲e̲n̲t̲s̲

         When RMA elements are in parallel, it is required that
         one or more of the parallel units are operative simultaneously
         to obtain the required system performance.  The actual
         number of parallel units required depends on the
         actual models.  Assuming operational redundancy and
         negligible recovery time, the calculation rules are:

         a.  M̲e̲a̲n̲ ̲T̲i̲m̲e̲ ̲B̲e̲t̲w̲e̲e̲n̲ ̲F̲a̲i̲l̲u̲r̲e̲

             When the parallel elements have defined MTBF and
             MTTR values the following rules apply:

             1̲ ̲o̲f̲ ̲2̲ ̲E̲q̲u̲a̲l̲ ̲P̲a̲r̲a̲l̲l̲e̲l̲ ̲E̲l̲e̲m̲e̲n̲t̲s̲

             The element MTBF, $MTBF_E$, is:

             $$ MTBF_E = \frac{2 \cdot MTBF \cdot MTTR + MTBF^2}{2 \cdot MTTR} $$

             or, provided $MTTR \ll MTBF$:

             $$ MTBF_E = \frac{MTBF^2}{2 \cdot MTTR} $$

             n̲ ̲o̲f̲ ̲n̲+̲1̲ ̲E̲q̲u̲a̲l̲ ̲P̲a̲r̲a̲l̲l̲e̲l̲ ̲E̲l̲e̲m̲e̲n̲t̲s̲

             $$ MTBF_E = \frac{(n+1) \cdot MTBF \cdot MTTR + MTBF^2}{n(n+1) \cdot MTTR} $$

             or, provided $(n+1) \cdot MTTR \ll MTBF$:

             $$ MTBF_E = \frac{MTBF^2}{n(n+1) \cdot MTTR} $$

         b.  M̲e̲a̲n̲ ̲T̲i̲m̲e̲ ̲T̲o̲ ̲R̲e̲p̲a̲i̲r̲

             The element mean time to repair, $MTTR_E$, corresponds
             to the period where fewer than n out of the n+1
             units are available, i.e. the element is not
             fully operative.

             1̲ ̲o̲f̲ ̲2̲ ̲P̲a̲r̲a̲l̲l̲e̲l̲ ̲E̲l̲e̲m̲e̲n̲t̲s̲

             $$ MTTR_E = \frac{MTTR}{2} $$

             n̲ ̲o̲f̲ ̲n̲ ̲+̲ ̲1̲ ̲P̲a̲r̲a̲l̲l̲e̲l̲ ̲E̲l̲e̲m̲e̲n̲t̲s̲

             $$ MTTR_E = \frac{MTTR}{2} $$


         c.  A̲v̲a̲i̲l̲a̲b̲i̲l̲i̲t̲y̲

             The availability corresponds to the ratio between
             the MTBF and the total time, which is equal to the
             sum of MTBF and MTTR for the element, thus:

             $$ A_E = \frac{MTBF_E}{MTBF_E + MTTR_E} $$
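
         The parallel rules of this section can be sketched as
         follows. The function names and the `approximate` switch
         are hypothetical. The inputs are the processor MTBF values
         quoted in section 7.3.1 with an MTTR of 30 minutes; with the
         approximate rule they give failure rates of about 0.74 and
         8.81 per 10^6 hours, the values shown for "NCP 1 of 2" and
         "NSP 5 of 6" in the single node model of section 7.4.

```python
def parallel_mtbf(mtbf_h, mttr_h, n=1, approximate=False):
    """Element MTBF for n of n+1 equal parallel elements (n=1 is 1 of 2)."""
    denominator = n * (n + 1) * mttr_h
    if approximate:
        # Valid when (n + 1) * MTTR is much smaller than MTBF.
        return mtbf_h ** 2 / denominator
    return ((n + 1) * mtbf_h * mttr_h + mtbf_h ** 2) / denominator


def parallel_mttr(mttr_h):
    """Element MTTR as given for both the 1 of 2 and n of n+1 cases."""
    return mttr_h / 2.0


def availability(mtbf_h, mttr_h):
    """A = MTBF / (MTBF + MTTR)."""
    return mtbf_h / (mtbf_h + mttr_h)


# Inputs from section 7.3.1: NCP (MTBF 1161 h), NSP (MTBF 1305 h), MTTR 0.5 h.
for name, mtbf_h, n in (("NCP 1 of 2", 1161.0, 1), ("NSP 5 of 6", 1305.0, 5)):
    element_mtbf = parallel_mtbf(mtbf_h, 0.5, n, approximate=True)
    element_rate = 1e6 / element_mtbf  # failures per 10^6 hours
    a_e = availability(element_mtbf, parallel_mttr(0.5))
    print(f"{name}: MTBF_E={element_mtbf:.0f} h, "
          f"LAMBDA={element_rate:.2f}, A_E={a_e:.7f}")
```

         The exact and approximate forms differ only by the term
         (n+1)*MTBF*MTTR in the numerator, which is negligible when
         the repair time is short compared with the MTBF.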


7.3      R̲E̲L̲I̲A̲B̲I̲L̲I̲T̲Y̲ ̲M̲O̲D̲E̲L̲S̲ ̲A̲N̲D̲ ̲B̲L̲O̲C̲K̲ ̲ ̲D̲I̲A̲G̲R̲A̲M̲S̲

         The computer system is partitioned into system elements,
         and the models used for reliability and availability
         predictions show how the proposed equipment provides
         the high degree of reliability required.

         The system reliability characteristics are stated in
         numerical terms by mathematical models; the supporting
         detailed predictions are presented in this chapter.
         The system models are partitioned into modular units
         and system elements that reflect the redundancy of
         the configuration; they account for all interconnections
         and switching points. The MTBF and MTTR for the individual
         elements used in the calculations were obtained from
         experience with similar equipment on the NICS-TARE,
         FIKS and CAMPS programmes. The figures quoted on peripheral
         equipment are based on data supplied by the manufacturers.

         The equipment has been partitioned and functions apportioned
         so that system elements can have only two states -
         operable or failed. System elements are essentially
         stand-alone and free of chain failures.

         Careful attention has been paid in the design to eliminating
         series risk elements. Redundant units are repairable
         without interruption of service. Maintenance and reconfiguration
         are possible without compromising system performance.

         The primary source selected for authenticated reliability
         data and predictions is MIL-HDBK-217. The failure
         rate data are primarily obtained from experience on
         previous programmes and are continuously revised as part
         of the maintenance programme on concurrent programmes.

         The reliability models which apply to the proposed
         configurations are identified in the figures shown
         on the following pages.

7.3.1    R̲e̲l̲i̲a̲b̲i̲l̲i̲t̲y̲ ̲M̲o̲d̲e̲l̲s̲ ̲f̲o̲r̲ ̲P̲r̲o̲c̲e̲s̲s̲i̲n̲g̲ ̲E̲l̲e̲m̲e̲n̲t̲s̲ ̲

         The reliability models, MTBF and availability predictions
         for the Processing Units are shown in the figures below:

         N̲o̲d̲a̲l̲ ̲S̲w̲i̲t̲c̲h̲ ̲P̲r̲o̲c̲e̲s̲s̲o̲r̲ ̲U̲n̲i̲t̲ ̲(̲N̲S̲P̲)̲

                       MTBF   =  1305 Hours
                       MTTR   =  30 Min.
                       Avail  =  99.962%
                       LAMBDA =  766
                       Fig. III 7.2.1.1

         N̲o̲d̲a̲l̲ ̲C̲o̲n̲t̲r̲o̲l̲ ̲P̲r̲o̲c̲e̲s̲s̲o̲r̲ ̲(̲N̲C̲P̲)̲

         The reliability model for the processing part of the
         Nodal Control is shown below

         N̲o̲d̲a̲l̲ ̲C̲o̲n̲t̲r̲o̲l̲ ̲P̲r̲o̲c̲e̲s̲s̲o̲r̲ ̲(̲N̲C̲P̲)̲

                       MTBF   =  1161 Hours
                       MTTR   =  30 Min.
                       Avail  =  99.957%
                       LAMBDA =  861
                       Fig. III 7.2.1.2


         Network Management Processor (NMP)

                       MTBF   =  1241 Hours
                       MTTR   =  30 Min.
                       Avail  =  99.960%
                       LAMBDA =  806
                       Fig. III 7.2.1.3


         E̲l̲e̲c̲t̲r̲o̲n̲i̲c̲ ̲M̲a̲i̲l̲ ̲P̲r̲o̲c̲e̲s̲s̲o̲r̲

                       MTBF   =  1453 Hours
                       MTTR   =  30 Min.
                       Avail  =  99.966%
                       LAMBDA =  688
                       Fig. III 7.2.1.4
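
         The availability and failure rate of each processing unit
         follow directly from its MTBF and MTTR. The sketch below
         recomputes both from the MTBF values quoted in the figures,
         using A = MTBF / (MTBF + MTTR) and LAMBDA = 10^6 / MTBF. The
         function names and the short label "EMP" for the Electronic
         Mail Processor are assumptions made for illustration.

```python
def availability_percent(mtbf_h, mttr_h):
    """Availability in percent, A = MTBF / (MTBF + MTTR)."""
    return 100.0 * mtbf_h / (mtbf_h + mttr_h)


def failure_rate_per_1e6_h(mtbf_h):
    """Failure rate in failures per 10^6 hours, LAMBDA = 10^6 / MTBF."""
    return 1e6 / mtbf_h


MTTR_H = 0.5  # 30 minutes, as quoted for every processing unit

# MTBF values in hours as quoted in the figures of this section.
processing_units = {"NSP": 1305.0, "NCP": 1161.0, "NMP": 1241.0, "EMP": 1453.0}

for name, mtbf_h in processing_units.items():
    print(f"{name}: Avail={availability_percent(mtbf_h, MTTR_H):.3f}%, "
          f"LAMBDA={failure_rate_per_1e6_h(mtbf_h):.0f}")
```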

7.4      A̲v̲a̲i̲l̲a̲b̲i̲l̲i̲t̲y̲ ̲o̲f̲ ̲a̲ ̲S̲i̲n̲g̲l̲e̲ ̲N̲o̲d̲e̲

         Shown below is the availability model for a single
         node, which includes the dual NCC.

         The following criteria are used in the calculations:

         *   The Nodal Switch LTU's are partitioned in groups of
             36 LTU's of which only 1 may have failed.

         *   A Nodal Switch Processor will still work, even
             if the V24 connection to the Nodal Control Processor
             does not work.

   NCP      CU_CP    NSP      Nodal    Nodal    Nodal    Nodal
   1 of 2            5 of 6   LTU      LTU      LTU      LTU
                              grp. 1   grp. 2   grp. 3   grp. 4

   0.74     1.59     8.81     5.2      5.2      5.2      5.2

                      MTBF  =  31,308 Hours
                      MTTR  =  30 Min.
                      Avail =  99.9984%
                      ==================

         A̲v̲a̲i̲l̲a̲b̲i̲l̲i̲t̲y̲ ̲f̲o̲r̲ ̲t̲h̲e̲ ̲N̲C̲C̲ ̲C̲U̲

                           +--- DISK CTRL ---- DISK ---+
                           |      54.4         250     |
            CU Crate ------+                           +------
            Assy.          |                           |
            1.4            +--- DISK CTRL ---- DISK ---+
                                  54.4         250

                      MTBF   =  630,915 Hours
                      MTTR   =  30 Min.
                      LAMBDA =  1.59

         A̲v̲a̲i̲l̲a̲b̲i̲l̲i̲t̲y̲ ̲f̲o̲r̲ ̲N̲S̲P̲ ̲ ̲L̲T̲U̲ ̲G̲r̲o̲u̲p̲





            CU         CU         CU        LTU      LIA-N

            Crate      Crate      Crate

            Assy.      Assy.      Assy.



            1.4        1.4        1.4       36.4     0.1


                                               35 of 36

                                                  1

                      MTBF   =  192,308 Hours
                      MTTR   =  15 Min.
                      LAMBDA =  5.20
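
         The single node result can be cross-checked from the block
         values in the node model, treated as failure rates per 10^6
         hours and combined with the series rule of section 7.2.1.
         The variable names below are hypothetical; the input values
         are those shown in the node model.

```python
# Failure rates (per 10^6 hours) of the serial blocks in the single node model.
node_blocks = {
    "NCP 1 of 2": 0.74,
    "CU_CP": 1.59,
    "NSP 5 of 6": 8.81,
    "Nodal LTU grp. 1": 5.2,
    "Nodal LTU grp. 2": 5.2,
    "Nodal LTU grp. 3": 5.2,
    "Nodal LTU grp. 4": 5.2,
}

MTTR_H = 0.5  # 30 minutes, as quoted for the node

node_rate = sum(node_blocks.values())              # series rule: rates add
node_mtbf_h = 1e6 / node_rate                      # MTBF in hours
node_avail = node_mtbf_h / (node_mtbf_h + MTTR_H)  # A = MTBF / (MTBF + MTTR)

print(f"LAMBDA={node_rate:.2f}, MTBF={node_mtbf_h:.1f} h, "
      f"Avail={100 * node_avail:.4f}%")
```

         The summed rate of 31.94 gives an MTBF of a little over
         31,300 hours and an availability of 99.9984%, in line with
         the node figures. The four LTU groups together account for
         about two thirds of the node failure rate.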

7.5      A̲v̲a̲i̲l̲a̲b̲i̲l̲i̲t̲y̲ ̲A̲c̲c̲e̲s̲s̲ ̲t̲o̲ ̲d̲e̲s̲t̲i̲n̲a̲t̲i̲o̲n̲

         Shown below is the reliability model for access point
         to access point in the primary path.

                      MTBF  =  9,777 Hours
                      MTTR  =  30 Min.
                      Avail =  99.9949%

7.5.1     A̲v̲a̲i̲l̲a̲b̲i̲l̲i̲t̲y̲ ̲a̲c̲c̲e̲s̲s̲ ̲t̲o̲ ̲d̲e̲s̲t̲i̲n̲a̲t̲i̲o̲n̲,̲ ̲i̲n̲c̲l̲u̲d̲i̲n̲g̲ ̲a̲l̲t̲e̲r̲n̲a̲t̲i̲v̲e̲
          ̲r̲o̲u̲t̲i̲n̲g̲ ̲

          The availability of access from access point to
          destination point is not improved by use of alternative
          routing.

          This is due to the small network, i.e. the two access
          nodes and the two access LTU's still have to work,
          and nearly all the unavailability is associated with
          these four components.

7.5.2     A̲v̲a̲i̲l̲a̲b̲i̲l̲i̲t̲y̲ ̲E̲n̲d̲ ̲U̲s̲e̲r̲ ̲t̲o̲ ̲V̲I̲A̲ ̲H̲o̲s̲t̲

          The reliability model for an end user's access to the
          VIA Host is shown below:

                      MTBF  =  44,803 Hours
                      MTTR  =  30 Min.
                      Avail =  99.9989%

          Legend                     Non CR,
                                     estimated

          *)  Note that the availability is calculated
              ???????

7.6       E̲l̲e̲c̲t̲r̲o̲n̲i̲c̲ ̲M̲a̲i̲l̲ ̲H̲o̲s̲t̲ ̲A̲v̲a̲i̲l̲a̲b̲i̲l̲i̲t̲y̲

          The Electronic Mail Host (EMH) availability model is
          shown in the figure below:

                      MTBF  =  249,513 Hours
                      MTTR  =  30 Min.
                      Avail =  99.9998%

7.7       N̲e̲t̲w̲o̲r̲k̲ ̲M̲a̲n̲a̲g̲e̲m̲e̲n̲t̲ ̲H̲o̲s̲t̲ ̲A̲v̲a̲i̲l̲a̲b̲i̲l̲i̲t̲y̲

          Shown below is the availability model for the Network
          Management Host:

                      MTBF  =  1129 Hours
                      MTTR  =  30 Min.
                      Avail =  99.9557%

7.8       E̲Q̲U̲I̲P̲M̲E̲N̲T̲ ̲M̲E̲A̲N̲ ̲T̲I̲M̲E̲ ̲B̲E̲T̲W̲E̲E̲N̲ ̲F̲A̲I̲L̲U̲R̲E̲S̲(̲M̲T̲B̲F̲)̲

          The high reliability of the proposed equipment is achieved
          through use of equipment with proven failure rates, similar
          to that supplied by Christian Rovsing for the NICS-TARE,
          FIKS and CAMPS programmes.

          From early in the design phase, a major objective for
          each module is to achieve reliable performance. CR80
          modules make extensive use of carefully chosen components;
          most of the IC's are tested to the requirements of MIL-STD
          883 level B.

          The failure rate (the inverse of MTBF) which applies
          to system elements and modules is listed in Table 7-8,
          entitled CR80 Reliability Factors.

          The MTBF data has been derived from reliability data
          maintained on the NICS-TARE, CAMPS and similar programmes.
          Inherent MTBF values are in general derived from
          reliability predictions made in accordance with the
          U.S. MIL-HDBK-217 "Reliability Prediction of Electronic
          Equipment". This document, adopted by Christian Rovsing
          through their involvement with NICS-TARE, is used
          extensively on current military and aerospace programmes.

          Failure rate data for terminal and peripheral equipment
          is generally provided by the vendor in accordance with
          the subcontract specifications.

                      R & M VALUES FOR MODULES AND PERIPHERALS
                      Table 7-8 (Cont'd)
