DataMuseum.dk presents historical artifacts from the history of: DKUUG/EUUG Conference tapes
Length: 33719 (0x83b7) Types: TextFile Names: »big.txt«
└─⟦db229ac7e⟧ Bits:30007240 EUUGD20: SSBA 1.2 / AFW Benchmarks └─⟦this⟧ »EUUGD20/AFUU-ssba1.21/ssba1.21E/musbus/Workload/big.txt« └─⟦this⟧ »EUUGD20/AFUU-ssba1.21/ssba1.21F/musbus/Workload/big.txt«
.\"########## .\"# Big troff text file for workload scripts .\"########## .nr Pw 6.75i .nr Po 0i .ll \n(Pwu .ev 2 .ll \n(Pwu .lt \n(Pwu .ev .po -\n(Po .hy 14 .nr II 0 .de SZ .ps \\$1 .vs \\$1u*1.25p .. .de NH .in 0 .if t .sp 0.5v .if n .sp .ne 6v .SZ 12 .ft B \\$1 \\$2 .br .ft P .SZ 10 .. .de LP .in \\n(IIu .if t .sp 0.35v .if n .sp .ne 3v .. .de IP .LP .in +4n .ta 4n .ti -4n \\$1 \\c .. .de PR \\fI\\$1\\fP\\$2 .. .de FL \\fB\\$1\\fP\\$2 .. .de SC .ti +6n .if t .HS $ \\$1 .br .if t .HE .. .de SV \\fI$\\$1\\fP\\$2 .. .de RS .in +6n .. .de RE .in -6n .. .de HS .ft H .ps -1 .. .de HE .ps +1 .ft P .. .de VA .IP "\fIShell Variable:\fP \\fB\\$1\\fP (default: \\$2)" .br .. .de TN .IP "\fITest Name:\fP \\fB\\$1\\fP" .br .. \& .sp 1i .de Fo 'Hd .. .de Hd .ev 2 .bp .in 0 .sp .tl 'MUSBUS Introduction''%.' .sp .ev .. .wh -0.1i Fo .ad c .SZ 14 .ft B An Introduction to the Monash Benchmark Suite (MUSBUS) .if t .sp 0.5v .if n .sp .SZ 12 .ft I Ken J. McDonell .if t .sp 0.5v .if n .sp .SZ 10 .ft R Department of Computer Science .br Monash University .br Clayton, AUSTRALIA 3168 .sp ACSnet: kenj@moncskermit.oz .br USENET: seismo!munnari!moncskermit.oz!kenj .br ARPA: kenj%moncskermit.oz@seismo.arpa .sp Revised: 24 June, 1987 .br .ad b .NH 1 Introduction .LP The Monash University Suite for Benchmarking UNIX\v'-.5n'\(dg\v'+.5n' .de DF 'FN .rm DF .. .wh -0.7i DF .de FN .ev 2 .in 0 .sp \\l'1.5i\(ul' .sp 0.5v \v'-.5n'\(dg\v'+.5n' UNIX is a trademark of AT&T .sp 2 .ev .rm FN .ch DF -0.01i .. Systems (MUSBUS), has been developed to assist in .IP (a) identifying bottlenecks and performance problems in new UNIX ports, and .IP (b) providing a robust test environment in which the performance of competing UNIX systems may be compared. .LP This document provides an overview for \fBVersion\fP \fB5.0 (Beta)\fP of MUSBUS and is intended for knowledgeable programmers trying to run the software on their own hardware. 
.NH 2 Preliminaries .NH 2.1 "Software Environment" .LP You will require a system that supports Level 7, System V or BSD compatibility, along with the following programs. .RS .PR sh (the Bourne shell) .br .PR awk .PR cat .PR cc .PR chmod .PR comm .PR cp .PR date .PR dc .PR df .PR echo .PR ed .PR expr .PR kill .PR ls .PR make .PR mkdir .PR rm .PR sed .PR test .PR time .PR touch .PR tty .PR who .RE .NH 2.2 "Getting Started" .LP All the files are distributed in a single directory. Once these have been retrieved from the distribution some initial housekeeping and system specifics have to be sorted out. .LP When fully installed, MUSBUS contains files in several subdirectories as follows, .IP \(bu .FL Results , log files created by the command procedure .PR run . .IP \(bu .FL Tmp , temporary files created by .PR run and friends. .IP \(bu .FL Tools , post processors to produce .PR tbl input from the log files. .IP \(bu .FL Workload , descriptions of the workload profile, all associated data files and some work script manipulation tools. .LP To explicitly create these directories and distribute the required files into the appropriate places you may .SC "make install" however this will be done automatically by .PR run as required. .LP The file .FL time.awk is used by the command procedure .PR run to average the results from several attempts to time a particular test and so depends upon the format of output from .PR /bin/time . The results from multiple timing attempts are held temporarily in the file .FL Tmp/tmp.$$ (where $$ is the pid of the .PR run shell). Try .SC "/bin/time date" and check the output from .PR /bin/time . 
If it has a format like .ti +4n 0.4 real 0.0 user 0.1 sys .br then .SC "rm -f time.awk" .SC "ln BSDtime.awk time.awk" .LP If the .PR /bin/time output looks like .RS .ta 8n .nf real 0:00.4 user 0:00.0 sys 0:00.1 .fi .RE then .SC "rm -f time.awk" .SC "ln SysVtime.awk time.awk" .LP Otherwise create your own version of .FL time.awk using .FL *time.awk as examples. .LP Some of the tests require system calls from the C code to measure small elapsed times. This is a real problem since there appears to be no universally correct way of doing this in the Unix world. The particular source files are .FL clock.c , .FL fstime.c and .FL mem.c . In the .FL Makefile , ensure that you have \fBone\fP of the following definitions included in the CFLAGS (in addition to the \(miO). .RS .ta 12n .nf .if t \fH\s-1\(miDSysV\s+1\fP you are using a System V brand of Unix .if n \(miDSysV you are using a System V brand of Unix .if t \fH\s-1\(miDBSD4v2\s+1\fP you are using a Berkeley 4.2 or 4.3 system .if n \(miDBSD4v2 you are using a Berkeley 4.2 or 4.3 system .if t \fH\s-1\(miDBSD4v1\s+1\fP you are using a Berkeley 4.1 system .if n \(miDBSD4v1 you are using a Berkeley 4.1 system .fi .RE For example, .ti +6n .if t \fH\s-1CFLAGS = \(miO \(miDBSD4v2\s+1\fP .if n CFLAGS = -O -DBSD4v2 .LP If \fBnone\fP of these systems is appropriate, the source files \fIwill not compile\fP and you will have to decide on appropriate alternative calls and coding to suit local conditions. .LP Check the .FL HISTORY file (if it exists) for notification of any changes, additions or problems that may have been made or fixed subsequent to the version of MUSBUS described in this document. .LP Try .SC "make programs" to confirm that every necessary program can be compiled and loaded correctly. .LP Now attempt to run all the tests once (this takes roughly 20 minutes). 
Using the Bourne shell (\c .PR /bin/sh ), .SC "iterations=1" .SC "nusers=1" .SC "export iterations nusers" .SC "./run" .LP This should demonstrate that all the .PR sh, .PR awk, .PR sed and .PR ed scripts can be made to work. Verification of the health of things to this point depends upon checking the output from .PR run to ensure that no nasty errors are reported, and in particular scrolling through the file .FL Results/log . Every time .PR run is used information is \f3appended\fP to .FL Results/log and .FL Results/log.work , so make sure that these files are removed or renamed before you start to do anything serious. The contents of .FL Results/log should look something like .LP .RS .nf .HS Start Benchmark Run (MUSBUS Version X.Y) Tue Jun 23 17:18:21 EDT 1987 (long iterations 6 times) 2 interactive users. Arithmetic Test (type = arithoh): 1000 Iterations Elapsed Time: 0.44 seconds (variance 0.003) CPU Time: 0.30 seconds [ 0.30u + 0.00s ] (variance 0.000) Arithmetic Test (type = register): 1000 Iterations Elapsed Time: 3.36 seconds (variance 0.008) CPU Time: 3.18 seconds [ 3.13u + 0.05s ] (variance 0.008) [ ... and lots more similar goodies ] Output sent to ... /dev/ttyp0 Directories for temporary files ... Tmp .nf .ta 14n,+8n,+8n,+8n,+8n Filesystem kbytes used avail capacity Mounted on /dev/hp0a \07419 \06202 \0\0475 93% / /dev/hp0g 38383 33296 \01248 96% /usr /dev/hp1a \07419 \04169 \02508 62% /jnk /dev/hp1b 15583 \03181 10843 23% /usr/spool /dev/hp1g 38383 32579 \01965 94% /mnt /dev/up1a \07471 \0\0\010 6713 \00% /tmp SIGALRM check: 12 x 5 sec delays takes 60.05 wallclock secs (error -0.08%) Simulated Multi-user Work Load Test: 1 Concurrent Users, each with Input Keyboard Rate 2 chars / sec Elapsed Time: 425.83 seconds (variance 0.125) CPU Time: 27.20 seconds [ 17.30u + 9.90s ] (variance 0.013) 1 interactive users. End Benchmark Run (Wed Jun 24 09:33:55 EDT 1987) .... 
.HE .fi .RE .LP Beware of lines with the following formats, they indicate something is \fBwrong\fP. .if t .IP "\fH\s-1** Iteration x Failed: text\fP\s+1" .if n .IP "** Iteration x Failed: text" .br Something (\fItext\fP) other than the normally anticipated output from .PR /bin/time was found in the file .FL Tmp/tmp.$$ . .if t .IP "\fH\s-1Elapsed Time: -- no measured results!!\fP\s+1"; .if n .IP "Elapsed Time: -- no measured results!!"; .br Not a single valid timing result was found in .FL Tmp/tmp.$$ . .if t .IP "\fH\s-1Terminated during iteration n\fP\s+1" .if n .IP "Terminated during iteration n" .br Premature termination of a test, usually as the result of a shell trap taken from .PR run . Most often this is symptomatic of an earlier error reported in .FL Results/log . .if t .IP "\fH\s-1* Apparent errors from makework ... *\fP\s+1" .if n .IP "* Apparent errors from makework ... *" .br After cleaning the log files from the multi-user test (using .PR sed and the script .FL check.sed ) some lines remained that probably indicate real errors which forced the multi-user test to terminate prematurely. Depending upon the formats of messages from programs (especially in the multi-user workload), .FL check.sed may need some local fine tuning to remove lines, that do not reflect genuine error conditions, from the log files. If this is not done, the tests will be aborted prematurely based upon the classification of a spurious message as a real error condition. .if t .IP "\fH\s-1Reason?: text\fP\s+1" .if n .IP "Reason?: text" .br .PR Makework (the controlling program for the multi-user test) has detected an inconsistency and taken a fatal dive, \fItext\fP comes from .PR perror () and the previous line in .FL Results/log will contain .PR makework "'s" idea of what is wrong. .if t .IP "\fH\s-1* Benchmark Aborted .... *\fP\s+1" .if n .IP "* Benchmark Aborted .... *" .br Just what it says! 
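.LP
The markers listed above lend themselves to a mechanical check. The following is a minimal sketch, assuming the log is supplied on standard input and that the markers appear literally as shown; the function name scan_log is hypothetical and is not part of MUSBUS.

```shell
#!/bin/sh
# scan_log: hypothetical helper, NOT part of the MUSBUS distribution.
# Reads a log on standard input and prints every line carrying one of
# the failure markers described in the text.
# Exit status: 0 if no marker was seen, 1 otherwise.
scan_log() {
    awk '
        /Iteration .* Failed:/          ||
        /no measured results!!/         ||
        /Terminated during iteration/   ||
        /Apparent errors from makework/ ||
        /Reason\?:/                     ||
        /Benchmark Aborted/             { print; bad = 1 }
        END { exit bad + 0 }
    '
}
```

A typical invocation would be "scan_log < Results/log". An empty output with a zero exit status only means that none of these particular markers was found; test-specific error reports still need to be read by eye.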
.LP Other possible error reports in .FL Results/log relate to specific tests and are either self explanatory (e.g. missing or illegal program options) or described in the Sections below. .LP The file .FL Results/log.work contains detailed logging of the multi-user test, and may contain useful information in the event that this test fails or terminates prematurely. Besides logging process ids and file descriptor assignments for each simulated user's job stream, standard error output is trapped and reported in .FL Results/log.work . .NH 3 "The Tests" .LP If you are serious about the results produced, these tests should be run on a dedicated system without concurrent activity. When possible, an idle system in multi-user mode is preferable to a single user system. .LP All the tests are controlled by shell variables used within the command procedure .PR run . By setting environment variables of the same name, the default values of the shell variables may be over-ridden, however if the defaults are consistently wrong for particular variables it is safer (i.e. less error prone) to modify the defaults in .PR run . .LP .PR Run does its work for the most part silently, logging information to certain files, and providing a terse summary of the particular test(s) being run on the tty from which .PR run was invoked. .LP A designated test may be run using the command .SC "./run thing" where \fIthing\fP is one of the test names described in the following Sections. The commands .SC "./run" or .SC "./run all" will run everything. .LP .PR Run may be interrupted from the keyboard (SIGINT) if it is started in foreground and after some fooling about it manages to shut things down and clean up files. .PR run creates a .PR sh command procedure .FL Tmp/kill_run that may be used to shut down a background .PR run via .SC "Tmp/kill_run" .VA iterations 6 Unless otherwise stated, this variable controls the number of times each test is repeated for timing. 
At the beginning of each iteration, the program .FL iamalive writes the iteration number (without newline or carriage return) on standard output. .NH 3.1 "Raw Speed Measures" .NH 3.1.1 "Specific Arithmetic" .LP This family of tests computes the sum of a series of terms such that the arithmetic is unbiased towards operator type (i.e. equal numbers of additions, subtractions, multiplications and divisions). Each major loop in the computation involves summing 100 terms of the series. .VA arithloop 1000 Number of major loops in the computation. .TN arithoh Do not compute the series, so measures the overhead in the computation. .TN register Arithmetic uses registers. .TN short Arithmetic uses shorts. .TN int Arithmetic uses ints. .TN long Arithmetic uses longs. .TN float Arithmetic uses floats. .TN double Arithmetic uses doubles. .LP After all the arithmetic tests have been performed, the .PR sh script .FL Tools/Adjust should be used with .FL Results/log , i.e. .SC "./Tools/Adjust Results/log" to compute the \fBactual\fP CPU and elapsed times when the overhead measured by the test .PR arithoh is subtracted. It is these times (i.e. \fIminus the startup and loop overhead\fP) that have been published and circulated amongst MUSBUS users. Failure to run the .PR Tools/Adjust script will make the machine you are testing look comparatively worse than it really is! Note that .PR Tools/Adjust will be run automatically by the log file postprocessors (\c .PR Tools/mktbl and .PR Tools/mkcomp ) if the times have not already been adjusted. Once the adjustment has been made, the relevant portion of .FL Results/log should look something like (note \fBActual\fP times in parentheses), .LP .RS .nf .HS Start Benchmark Run (MUSBUS Version X.Y) Tue Jun 23 17:18:21 EDT 1987 (long iterations 6 times) 2 interactive users. 
Arithmetic Test (type = arithoh): 1000 Iterations Elapsed Time: 0.44 seconds (variance 0.003) CPU Time: 0.30 seconds [ 0.30u + 0.00s ] (variance 0.000) Arithmetic Test (type = register): 1000 Iterations Elapsed Time: 3.36 seconds (variance 0.008) (Actual: 2.92 ) CPU Time: 3.18 seconds [ 3.13u + 0.05s ] (variance 0.008) (Actual: 2.88 ) [ ... and lots more similar goodies ] .HE .fi .RE .NH 3.1.2 "General Purpose Arithmetic" .TN dc Compute the square root of 2 to 99 decimal places using .PR dc . The .PR dc input is in .FL dc.dat . This test is due to John Lions (University of New South Wales) who has suggested it as a good first order measure of raw system speed. .NH 3.1.3 Recursion .VA ndisk 17 .TN hanoi A recursive solution to the classical Tower of Hanoi problem. Work increases as 2**(number of disks). .SV ndisk provides a \fIlist\fP of the number of disks for a \fBset\fP of problems, however the default setting is for a singular set. .NH 3.1.4 "System Calls, Pipes, Forks, Execs and Context Switches" .VA ncall 4000 .TN syscall Sit in a hard loop of .SV ncall iterations, making 5 system calls per iteration. The system calls (\c .PR dup (0), .PR close (i), .PR getpid (), .PR getuid () and .PR umask (i)) involve little work on the part of the UNIX kernel, so the test predominantly measures the overhead associated with the system call mechanism. .VA io 2048 .TN pipe One process (therefore no context switching) that writes and reads a 512 byte block along a pipe .SV io times. .VA children 100 .TN spawn Simply repeat .SV children times; fork a copy of yourself and wait for the child process to exit (which it should do immediately). .VA nexecl 100 .TN execl Perform .SV nexecl execs using .PR execl (). The program to be exec'd has been artificially expanded to a reasonable size (on a VAX, 11264 text + 2048 data + 24388 bss). .VA switch1 500 .TN context1 Perform 2 x .SV switch1 context switches, using pipes for synchronization. 
The test involves 2 processes connected via 2 pipes. One process writes then reads a 4-byte (descending) sequence number, while the other process reads then writes a sequence number. Synchronization is validated at each swap by checking the values of the sequence numbers read and written. .NH 3.1.5 "C Compilation and Loading" .TN C Measure the time for each of .SC "cc -c cctest.c" .SC "cc cctest.o" where .FL cctest.c contains 124 lines of uninteresting C code (108 lines of real code after .PR cpp ). .NH 3.1.6 "Memory Access Speed" .LP These tests try to measure read accesses per real second into an array of integers. Because of inaccuracies in measuring small real times, the results of this test are subject to large variances and can not be interpreted with great confidence (e.g. negative and infinite speeds have been observed). Consequently, these tests are best considered as historical curiosities from the days when MMUs were bottlenecks on microprocessor-based systems, and \fBno\fP real significance should be attached to the observed times. .VA poke 100000 Number of array accesses. .VA arrays "8 64 512" List of array sizes in units of 1024 ints. .TN seqmem Cyclic sequential access pattern, hitting each element of the array in turn. .TN randmem Random access patterns -- to give VM systems a chance to do something better! .NH 3.1.7 "Filesystem Throughput" .VA blocks "62 125 250 500" A list of file sizes in Kbytes. .VA where . The directory in which the files will be created. This test requires at least 2 x max(\c .SV blocks ) Kbytes of free space in the filesystem containing .SV where . .TN fstime This program attempts to measure file write time, file read time and file copy time. It is assumed that BUFSIZ as defined in <stdio.h> is a good size for physical i/o, and all i/o is done via direct calls to .PR read () and .PR write (). This test is performed (\c .SV iterations /2) times. 
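.LP
The free space requirement for the filesystem test, 2 x max(blocks) Kbytes, can be worked out before starting a run. The sketch below assumes a POSIX shell with arithmetic expansion; the function name fs_space_needed is hypothetical and is not part of MUSBUS.

```shell
#!/bin/sh
# fs_space_needed: hypothetical helper, NOT part of the MUSBUS distribution.
# Arguments: the file sizes (in Kbytes) from the "blocks" variable.
# Prints 2 x the largest size, i.e. the minimum free space in Kbytes
# that the text says the filesystem test requires.
fs_space_needed() {
    max=0
    for b in "$@"; do
        if [ "$b" -gt "$max" ]; then
            max=$b
        fi
    done
    echo $((2 * max))
}
```

With the default list, "fs_space_needed 62 125 250 500" prints 1000, which can then be compared by hand against the "avail" column that df reports for the filesystem containing the chosen directory.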
.LP Beware of the \fIwrite\fP time, since this can be influenced by the size of the disk block cache in the kernel. Before the reads are commenced there are a couple of .PR sync ()s and a 5 second sleep to try and flush the cache. The times for small files are most sensitive to disk block caching. .LP Really the \fIcopy\fP time for the largest file is the best indicator of throughput and reflects the type of disk activity most commonly generated by compilers, editors, assemblers, etc. Also the rates are measured against elapsed time, so there is some scope for variance; however, the absolute times are usually long enough to make this effect insignificant \fBprovided\fP there is no concurrent disk activity on the same spindle! .NH 3.2 "Emulated Multi-user Test" .VA nusers "1 4 8 16 24 32" A list of the numbers of users to be emulated. .VA ttys /dev/tty A \fBlist\fP of tty devices where the simulated tty output is sent -- there is a lot of this, and you should ensure that these tty lines are operating at the normal baud rate (e.g. 9600) for the test system. If your CPU console does not use a standard serial multiplexer (e.g. a VAX, Pyramid, Gould, DG, etc.), then the tty output should be directed to \fIsome other\fP tty line(s) that \fBdo\fP use the ordinary serial port hardware. .VA dirs Tmp A \fBlist\fP of directories that will be used to create subdirectories and temporary files to run the user job streams from. .VA rate 2 Users are assumed to type at a rate of .SV rate characters per second. .TN work Of all the tests in MUSBUS, this is by far the most complicated, most realistic and most likely to fail. This test is performed (\c .SV iterations /2) times. .LP The synthetic workload is created from a number of job streams, each of which is described by a line in the file .FL Tmp/workload . 
Each line consists of .IP \(bu the home directory for the job stream, .IP \(bu the full pathname of the program to run, .IP \(bu optional arguments to that program, .IP \(bu an optional source of standard input to that program (a filename prefixed by ``<''), and .IP \(bu an optional destination for standard output from that program (a filename prefixed by ``>''). .LP .FL Tmp/workload is created automatically by the command script .PR run based upon .IP (a) the variables .SV dirs and .SV ttys , and .IP (b) the workload profile .FL Workload/script.master from which the script interpreter program name is extracted and the individual input script files (\c .FL Tmp/script.? ). .LP When .FL Tmp/workload is constructed, a cyclic scheme is used to share user work amongst the available directories and tty lines (as per .SV dirs and .SV ttys ). In this way, serial i/o bottlenecks for large numbers of simulated users, and unbalanced disk i/o across spindles may be avoided. As a dynamic check, the program .PR ttychk is used within .PR run to check for potential bandwidth limitations on the serial i/o lines, given the number of lines and the maximum number of job streams. .LP The workload profile (\c .FL Workload/script.master ) has the following format. .IP 1. The first line must begin ``%W%'' followed by the full pathname of the relevant interpreter and any required options. For example, if the script should be run by the Bourne shell, an appropriate specification would be .RS .HS %W% /bin/sh -ie .HE .RE .IP 2. All subsequent lines up to the first line beginning with ``%%'' are preamble commands that must appear at the \f3beginning\fP of \f3every\fP script. .IP 3. Sequences of commands terminated by a line beginning with ``%%'' constitute a job step. Each job step is an autonomous piece of work such that once the preamble has been executed, job steps may be executed in \f3any\fP order. .IP 4. 
Any lines following the last ``%%'' line form a postscript that must appear at the \f3end\fP of \f3every\fP script. .LP The command procedure .PR mkscript and the program .PR mkperm are used (by .PR run ) to create several (usually 4) scripts from .FL Workload/script.master with random permutations of the job steps. These scripts reside in .FL Tmp/script.? and are assigned in a cyclic manner to create the job streams. The work for \fBeach\fP simulated user is generated from \fBone\fP job stream. .LP For example the distributed .FL Workload/script.master is .LP .RS .nf .if t .HS .CK Workload/script.master %W% /bin/sh -ie mkdir /tmp/$$ tmp %% 1 edit \&./keyb edscr1.dat | ed edit.dat : ....................................................... : . This is some filler of about the same . : . size as the file edscr1.dat, since the . : . emulated input proceeds in parallel, and . : . we want the real-time delay to be about right . : ....................................................... chmod u+w temporary rm temporary %% 2 ls ls -l %% 3 cat cat cat.dat %% 4 compile cc -c cctest.c 1>&2 rm *.o %% 5 edit, compile and link chmod 444 dummy.c \&./keyb edscr2.dat | ed dummy.c : . more textual and time filler for the second edscript file, edscr2.dat . cc dummy.c 1>&2 rm a.* grunt.c %% 6 grep grep '[ ]*nwork' grep.dat %% 7 file copying cp *.c edit.dat /tmp/$$ cp /tmp/$$/* tmp %% rm -rf tmp /tmp/$$ .if t .HE .fi .RE .LP This generates several job streams one of which (\c .FL Tmp/script.1 ) contains, .LP .RS .nf .if t .HS .CK Tmp/script.1 mkdir /tmp/$$ tmp cc -c cctest.c 1>&2 rm *.o \&./keyb edscr1.dat | ed edit.dat : ....................................................... : . This is some filler of about the same . : . size as the file edscr1.dat, since the . : . emulated input proceeds in parallel, and . : . we want the real-time delay to be about right . : ....................................................... 
chmod u+w temporary rm temporary cat cat.dat grep '[ ]*nwork' grep.dat chmod 444 dummy.c \&./keyb edscr2.dat | ed dummy.c : . more textual and time filler for the second edscript file, edscr2.dat . cc dummy.c 1>&2 rm a.* grunt.c cp *.c edit.dat /tmp/$$ cp /tmp/$$/* tmp ls -l rm -rf tmp /tmp/$$ .if t .HE .fi .RE .LP Given the following environment variable assignments, .RS .nf nusers=8 ttys=/dev/ttyh0 /dev/ttyh8 /dev/ttyha dirs=Tmp /usr/tmp .fi .RE the created workload description file (\c .FL Tmp/workload ) contains .LP .RS .nf .if t .HS Tmp/user1 /bin/sh -ie <Tmp/script.1 >/dev/ttyh0 /usr/tmp/user2 /bin/sh -ie <Tmp/script.2 >/dev/ttyh8 Tmp/user3 /bin/sh -ie <Tmp/script.3 >/dev/ttyha /usr/tmp/user4 /bin/sh -ie <Tmp/script.4 >/dev/ttyh0 Tmp/user5 /bin/sh -ie <Tmp/script.1 >/dev/ttyh8 /usr/tmp/user6 /bin/sh -ie <Tmp/script.2 >/dev/ttyha Tmp/user7 /bin/sh -ie <Tmp/script.3 >/dev/ttyh0 /usr/tmp/user8 /bin/sh -ie <Tmp/script.4 >/dev/ttyh8 .if t .HE .fi .RE .LP It is strongly recommended that you create your own workload profile for the multi-user test to reflect the anticipated system usage. To do this, .IP 1. Use the distributed files in the .FL Workload directory as a guide. .IP 2. Create a new .FL Workload/script.master describing the required job steps. .IP 3. Ensure all required data files are in the .FL Workload directory, because every job stream executes with the current directory containing its own private copies of \f3all\fP the files from .FL Workload . .IP 4. Ensure the makefile (\c .FL Workload/Makefile ) has the following targets defined (they are assumed to exist by .PR run ). .RS .nr II 6n .IP (a) context : ensure all files needed to run a script are present. .IP (b) clean : remove any unnecessary temporary files, e.g. those created from somewhere else during a ``make context''. .IP (c) script.out : run a script and trap all the output; the file .FL script.out should contain the concatenation of the script input and the script output. 
This file is used by .PR ttychk to compute the output tty bandwidth requirements. .RE .nr II 0 .LP The program .PR makework reads the .FL Tmp/workload file and builds data structures for each job stream (i.e. each simulated user) describing the home directory, command interpreter and its options and standard input and standard output assignments. Thereafter .PR makework starts the user program(s) (\c .PR /bin/sh above) and pumps random chunks of input to them down pipes so that the aggregate rate across all simulated users does not exceed .SV rate \(mu .SV nusers characters per second. .LP Because of process creation limits, this test \fBmust be run as root\fP. .LP Because of open file limits, .PR makework will create \fIclones\fP of itself if there are too many users for it to simulate alone. .LP If the standard input files to the job streams invoke interactive programs (e.g. \c .PR ed ), then substantial care must be taken that the data pumped down the pipe by .PR makework ends up at the correct destination. This has been the cause of some catastrophic problems in which only parts of the job streams have been run by .PR /bin/sh and the rest has been sucked up and thrown away by .PR /bin/ed . .LP To try and avoid these problems, the program .PR keyb has been created to emulate one user typing at a terminal. .PR keyb (like .PR makework ) uses the environment variables .SV rate and .SV tty (as set up by .PR run ) to know how fast to generate output and where the input should be echoed to. .LP Once .PR run work has finished, it is \fBessential\fP that the following checks be performed. .IP (1) Look in the file .FL Results/log . Check for wild variances in the execution times (a sign that not all job streams are being run to completion), and any obscure error messages that would have been generated on stderr from .PR makework . Make sure that the execution times look reasonable. 
One easy check is that the CPU time for ``n'' users \f2must be at least\fP \&``n'' times the CPU time for one simulated user, since the CPU times should be nearly the same for all users in a given run (although the CPU time per user is expected to rise as the number of concurrent users increases). .IP (2) Check the file .FL Results/log.work that contains echoed comments, status information and shell error output from the simulated user work. .nr II 4n .LP The lines preceded by a line of the form ``Tmp/userlog.nnn:'' are copies of the shell error output for simulated user ``nnn''. This should consist of a row of ``# ''s (assuming root's .PR /bin/sh prompt is ``# ''). .LP The lines preceded by a line of the form ``Tmp/masterlog.nnn:'' are the standard error output from master number ``nnn''. Master 0 is the real master .PR makework process, the others are clones. Check that there are no messages of the forms, .RS .nf .if t .HS makework: cannot open %s for std output makework: chdir to "%s" failed! user %d job %d pid %d done exit code %d user %d job %d pid %d done status 0x%x user %d job %d pid %d done exit code %d status 0x%x clone %d done, pid %d exit code %d clone %d done, pid %d status 0x%x clone %d done, pid %d exit code %d status 0x%x user %d job %d pid %d killed off \&... reason ... pid %d killed off .if t .HE .fi .RE \fBAny\fP of these messages indicates something has gone terribly wrong. .LP On the other hand, messages of the form .RS .nf .if t .HS master pid %d clone %d pid %d user %d job %d pid %d pipe fd %d user %d job %d pid %d done clone %d done, pid %d .if t .HE .fi .RE are just warm reassurance that everything is going well. .nr II 0 .NH 3.3 Miscellaneous .TN x Like ``run work'', except the initial filesystem status reporting, tty bandwidth check and clock checks are omitted. Useful when using the multi-user test for diagnostic purposes, and the initial housekeeping is not needed. 
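.LP
The distinction between alarming and reassuring messages in the multi-user work log turns on what follows the word ``done''. The sketch below is one way to separate the two, assuming the log lines appear literally in the printf-style forms listed above with decimal numbers substituted; the function name check_worklog is hypothetical and is not part of MUSBUS.

```shell
#!/bin/sh
# check_worklog: hypothetical helper, NOT part of the MUSBUS distribution.
# Reads makework logging on standard input and prints only the lines that
# match the "something has gone wrong" message forms. A bare "... done" line
# is harmless; "... done" followed by an exit code or status is not.
# Exit status: 0 if nothing suspicious was seen, 1 otherwise.
check_worklog() {
    awk '
        /makework: cannot open .* for std output/                   ||
        /makework: chdir to .* failed!/                             ||
        /killed off/                                                ||
        /user [0-9]+ job [0-9]+ pid [0-9]+ done (exit code|status) / ||
        /clone [0-9]+ done, pid [0-9]+ (exit code|status) /         { print; bad = 1 }
        END { exit bad + 0 }
    '
}
```

A typical invocation would be "check_worklog < Results/log.work". The shell error output copied from the Tmp/userlog files is not examined by this sketch and still needs to be checked for anything other than prompts.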
.NH 4 "The Complete Test" .LP When everything is apparently installed and operating correctly, log in as root, choose another inactive terminal running at 9600 baud (/dev/ttyx below) and start the whole charade as follows. .SC tty=/dev/ttyx .SC "export tty" .SC "rm Results/log Results/log.work" .SC "./run &" .LP On a 4 Mbyte VAX 11/780, simulating 1, 4, 8, 16, 24 and 32 users in the multi-user test, this takes about 5 hours to run! .NH 5 "What Does It All Mean?" .LP Finally one should be in a position to contemplate the summaries in the .FL Results/log file. Look in .FL Tools for the scripts .PR mktbl and .PR mkcomp to create .PR tbl input directly from the log files for a single system or to compare two systems. This is most useful in comparison to the same tests run on another system, or on another version of the same system. .LP Lots of things can influence the results, and people interpreting the results should be aware of the following (probably incomplete) list. .IP (1) Available real memory for the disk block cache and user processes. .IP (2) The physical disk hardware; number and type of spindles, controller type and paths to devices. .IP (3) The logical disk arrangement; allocation of critical directories such as .PR /tmp , .PR /usr and user filesystem across physical devices, the number and distribution of swap partitions. .IP (4) Standard of C compiler and optimizer; everything tested is written in C, any improvements here will help everything (even the kernel!). .IP (5) Physical block size for swapping and paging; some of the test programs are very small and so may incur large physical i/o costs. .IP (6) Flavour of UNIX you are using. .IP (7) The accuracy of real time measurements and flow control in .PR makework . Check the output from .PR clock in .FL Results/log at the start of the multi-user test to determine the extent to which a controlling SIGALRM loop measures wallclock time -- this can influence elapsed time in the multi-user test particularly. 
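.LP
The SIGALRM check line in the sample log (``12 x 5 sec delays takes 60.05 wallclock secs (error -0.08%)'') can be reproduced by hand. The sketch below assumes the error is (expected - measured) / expected, expressed as a percentage; that sign convention is inferred from the single sample line and may not be how the clock program computes it. The function name alarm_error is hypothetical and is not part of MUSBUS.

```shell
#!/bin/sh
# alarm_error: hypothetical helper, NOT part of the MUSBUS distribution.
# usage: alarm_error COUNT DELAY_SECS MEASURED_WALLCLOCK_SECS
# Prints the percentage error between the nominal time (COUNT x DELAY_SECS)
# and the measured wallclock time. ASSUMPTION: the formula
# (expected - measured) / expected * 100 is inferred from one sample log line.
alarm_error() {
    awk -v n="$1" -v d="$2" -v m="$3" 'BEGIN {
        expected = n * d
        printf "%.2f\n", (expected - m) / expected * 100
    }'
}
```

For the sample figures, "alarm_error 12 5 60.05" prints -0.08. An error of more than a few percent would suggest that elapsed times in the multi-user test deserve extra scepticism.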
.NH 6 "Caveat Emptor" .LP The MUSBUS tests have been widely distributed, and in some cases their owners have not treated them kindly. The following list details known pitfalls in running the tests and the subsequent application of the results. .IP (1) There are several versions of the test suite, Version 3.3 (and later versions) in particular are very different to earlier versions and results obtained with different versions of the suite cannot be meaningfully compared. I make no claim for the long-term stability of MUSBUS, and so this evolutionary process is likely to continue with future releases of the test suite. .IP (2) There is \fBno\fP reason to suspect that the distributed workload for the multi-user test (i.e. the files .FL Tmp/script.? and .FL Tmp/workload ) is representative of the user work profile at \fByour\fP installation(s). Be prepared to alter or rebuild the workload to reflect your expected system usage. .IP (3) Results have been known to vary dramatically between releases of the UNIX you are testing. This reflects vendor tuning (sometimes breaking) of the UNIX port and MUSBUS is a useful diagnostic tool in this area, provided the MUSBUS version and workload profile remain fixed. .IP (4) Points (1) to (3) suggest that uncontrolled and uninformed comparisons of MUSBUS results are dangerous in the extreme. This is the main reason that I have not published the large collection of results accumulated to date. .IP (5) Remember that the tests described in Section 3.1 are intended for \fBdiagnostic\fP use. If you are interested in \fBperformance\fP, you should focus upon the multi-user test described in Section 3.2. .IP (6) Beware of simulating \fBtoo few\fP users in the multi-user test. Useful information about system throughput and performance under heavy load conditions can usually be obtained by extrapolation of various measures computed from the CPU and elapsed times for the multi-user tests with various numbers of users. 
However this assumes the machine has been sufficiently loaded to move out of the \fIlinear\fP part of the performance curves. For very fast machines, this may require emulation of a \fIlarge\fP number of users in the multi-user test. .IP (7) Beware of simulating \fBtoo many\fP users in the multi-user test. Using the default value for .SV ttys , \fBall\fP simulated tty output is directed to a \fIsingle\fP serial port. As you increase the number of simulated users in the multi-user test (in response to Point (6) above) the serial port bandwidth may become the limiting resource! This is easy to fix by adding a list of more tty devices to the value of .SV ttys . .IP (8) Be warned that the multi-user test has ``broken'' several UNIX ports. Causes have been identified as implementation (configuration) limits in the system being tested (e.g. proc slots), real bugs in the port or MUSBUS errors. This list is basically in order of decreasing probability. .LP Communication on MUSBUS experiences is welcomed at the electronic addresses on the first page of this document. If you have found a problem, or can suggest a better testing technique please let me know, so that future versions might offer real (as opposed to cosmetic) enhancements.