⟦9b267033e⟧

TextFile

%%%%
%%%% The Internet Virus of November 3, 1988
%%%% Mark W. Eichin, MIT Project Athena
%%%% 
%%%% $Header: program.latex,v 1.35 89/02/09 21:26:46 jon Locked $
%%%% $Log:	program.latex,v $
%%%% Revision 1.35  89/02/09  21:26:46  jon
%%%% fix unknown label ... stupid mistake
%%%% 
%%%% Revision 1.34  89/02/09  14:09:37  jon
%%%% slight re-wording
%%%% 
%%%% Revision 1.33  89/02/09  00:18:30  jon
%%%% move the information flow figure to the beginning
%%%% 
%%%% Revision 1.32  89/02/08  22:58:34  jon
%%%% remove l1pic and try to fix PATH= formatting problems
%%%% 
%%%% Revision 1.31  89/02/08  18:52:23  jon
%%%% fix awful formatting glitch by using \verbatim
%%%% 
%%%% Revision 1.28  89/02/05  22:31:23  eichin
%%%% Some slight wording changes and some typos.
%%%% 
%%%% Revision 1.27  89/01/28  22:41:56  jon
%%%% fix poor word choice and bad formatting
%%%% 
%%%% Revision 1.26  89/01/27  23:46:26  jon
%%%% fix many typos and some wording
%%%% try to turn the postscript figures into real figures
%%%% 
%%%% Revision 1.25  89/01/20  18:30:41  jon
%%%% fix figure label
%%%% 
%%%% Revision 1.24  89/01/20  00:27:53  jon
%%%% fix mispelling of label and add a figure
%%%% 
%%%% Revision 1.23  89/01/20  00:05:35  jon
%%%% chaper --> section (report --> article)
%%%% 
%%%% Revision 1.22  89/01/19  23:46:51  jon
%%%% typo's pointed out by map.
%%%% 
%%%% Revision 1.21  89/01/19  23:14:29  jon
%%%% reworking of h routine section and hit section plus minor fixes.
%%%% 
%%%% Revision 1.20  89/01/19  01:04:10  jon
%%%% typo fix
%%%% 
%%%% Revision 1.19  89/01/19  00:30:03  jon
%%%% add 2 labels and make references to ports consistent.
%%%% 
%%%% Revision 1.17  89/01/18  14:16:13  jon
%%%% minor changes (prashing, grammar)
%%%% 
%%%% Revision 1.16  88/12/15  22:10:40  jon
%%%% fix typo
%%%% 
%%%% Revision 1.15  88/12/15  22:08:53  jon
%%%% remove GECOS footnote
%%%% 
%%%% Revision 1.14  88/12/13  16:15:19  jon
%%%% \please -> 
%%%% \variable
%%%% 
%%%% Revision 1.13  88/12/10  19:56:17  jon
%%%% minor bogus word fixes and remove second definition of the loopback
%%%% address.
%%%% 
%%%% Revision 1.11  88/11/29  03:56:39  jon
%%%% *** empty log message ***
%%%% 
%%%% Revision 1.10  88/11/28  23:58:07  jon
%%%% add two labels
%%%% 
%%%% Revision 1.9  88/11/28  20:28:25  jon
%%%% *** empty log message ***
%%%% 
%%%% Revision 1.8  88/11/23  04:11:00  eichin
%%%% added pictures.
%%%% 
%%%% Revision 1.7  88/11/23  01:18:11  jon
%%%% remove "stats it" buzzword.
%%%% 
%%%% Revision 1.6  88/11/22  23:33:32  jon
%%%% ran ispell
%%%% 
%%%% Revision 1.5  88/11/22  16:08:35  eichin
%%%% added mkdir label, for strategy reference
%%%% 
%%%% Revision 1.4  88/11/22  11:48:43  eichin
%%%% fixed $$ problems
%%%% 
%%%% Revision 1.3  88/11/22  10:22:30  eichin
%%%% added section on names, and cleaned up some typesetting
%%%% 
%%%% Revision 1.2  88/11/22  07:36:56  eichin
%%%% added most of Bill Sommerfeld's comments. Cleaned up most of type
%%%% style as well.
%%%% 
%%%%
\section{The Program}
\label{progappendix}

\iffalse
XS.c XS96.c checkother.c cksub.c crack1.c crack3.c crack3sub.c
decode.c decode2.c hit_finger.c hit_rexec.c hitshell.c if_init.c
justreturn.c makemagic.c other_sleep.c s1778.c s21e0.c strreverse.c
stuffsmtp.c
\fi

This Appendix describes the virus program subroutine by subroutine.
For reference, the flow of information among the subroutines is
shown in Figure \ref{collectfig}.

\begin{figure*}[tp]
\vspace{7.7in}
\special{psfile=collect.ps hoffset=-36 voffset=-108}
\caption{The structure of the attacking engine.\label{collectfig}}
\end{figure*}

\subsection{Names}
The core of the virus is a pair of binary modules, one for the VAX
architecture and the other for the Sun architecture. These are
linkable modules, and thus have name lists for their internal
procedures. Many of the original names are included here with the
descriptions of the functions the routines performed.

It is surprising that the names are included, and astonishing that
they are meaningful. Some simple techniques, such as randomizing the
procedure names, would have removed a number of clues to the function
of the virus.

\subsection{main}
The main module, the starting point of any C language program, does
some initialization, processes its command line, and then goes off
into the loop which organizes all of the real work.

\subsubsection{Initialization}

The program first takes some steps to hide itself. It changes the
``zeroth'' argument, which is the process name, to \prog{sh}. Thus, no
matter how the program was invoked, it would show up in the process
table with the same name as the Bourne Shell, a program which often
runs legitimately.

The program also sets the maximum core dump size to zero blocks. If
the program crashed\footnote{For example, the virus was originally
compiled using 4.3BSD declaration files. Under 4.2BSD, the alias name
list did not exist, and code such as the virus which assumes aliases
are there can crash and dump core.} it would not leave a core dump
behind to help investigators. It also turns off handling of write
errors on pipes, which normally cause the program to exit.

The next step is to read the clock, store the current time in a local
variable, and use that value to seed the random number generator.

\subsubsection{Command line argument processing}
The virus program itself takes an optional argument \prog{-p} which must
be followed by a decimal number, which seems to be a process id of the
parent which spawned it. It uses this number later to kill that
process, probably to ``close the door'' behind it.

The rest of the command line arguments are ``object names''. These are
names of files it tries to load into its address space. If it can't
load one of them, it quits. If the \prog{-p} argument is given, it
also deletes the object files, and later tries to remove the disk
image of running virus, as well as the file
\filename{/tmp/.dumb}. (This file is not referenced anywhere else in
the virus, so it is unclear why it is deleted.)

The program then tried a few further steps, exiting (``bailing out'')
if any of them failed:
\begin{items}
\item It checked that it had been given at least one object on the
command line.
\item It checked to see if it had successfully loaded in the object
\filename{l1.c}.
\end{items}

If the ``-p'' argument was given, the program closes all file
descriptors, in case there are any connections open to the parent.

The program then erases the text of the argument array,
to further obscure how it was started (perhaps to hide anything if one
were to get a core image of the running virus.)

It scans all of the network interfaces on the machine, gets the flags
and addresses of each interface. It tries to get the point-to-point
address of the interface, skipping the loopback address. It also
stores the netmask for that network \cite{rfc950}.

Finally, it kills off the process id given with the ``-p'' option. It
also changes the current process group, so that it doesn't die when
the parent exits. Once this is cleaned up, it falls into the
\subroutine{doit} routine which performs the rest of the work.

\subsection{doit routine}
This routine is where the program spends most of its time.

\subsubsection{Initialization}
Like the main routine, it seeds the random number generator with the
clock, and stores the clock value to later measure how long the virus
has been running on this system.

It then tries \subroutine{hg}. If that fails, it tries
\subroutine{hl}. If that fails, it tries \subroutine{ha}.

It then tries to check if there is already a copy of the virus running
on this machine. Errors in this code contributed to the large amounts
of computer time taken up by the virus. Specifically:
\begin{items}
\item On a one-in-seven chance, it won't even try to test for another
virus.
\item The first copy of the virus to run is the only one which
listens for others; if multiple infections occur ``simultaneously''
they will not ``hear'' each other, and all but one will fail to listen
(see section~\ref{interlock}).
\end{items}

The remainder of the initialization routine seems designed to send a
single byte to address 128.32.137.13, which is \host{ernie.berkeley.edu},
on port 11357. This never happens, since the author used the
\subroutine{sendto} function on a TCP stream connection, instead of a
UDP datagram socket.\footnote{If the author had been as careful with
error checking here as he tried to be elsewhere, he would have noted
the error ``socket not connected'' every time this routine is
invoked.} We have no explanation for this; it only tries to send this
packet with a one in fifteen random chance.

\subsubsection{Main loop}
An infinite loop comprises the main active component of the virus. It
calls the \subroutine{cracksome} routine\footnote{This name was actually in
the symbol table of the distributed binary!} which tries to find some
hosts that it can break in to. Then it waits 30 seconds,
listening for other virus programs attempting to break in, and tries
to break into another batch of machines.

After this round of attacks, it forks, creating two copies of the
virus; the original (parent) dies, leaving the fresh copy. The child
copy has all of the information the parent had, while not having the
accumulated CPU usage of the parent. It also has a new process id,
making it hard to find.

Next, the \subroutine{hg}, \subroutine{hl}, and \subroutine{ha}
routines search for machines to infect (see Appendix \ref{hroutines}). The
program sleeps for 2 minutes, and then checks to see if it has been
running for more than 12 hours, cleaning up some of the entries in the
host list if it has.

\label{pleasequit}
Finally, before repeating, it checks the global variable
\variable{pleasequit}. If it is set, {\it and}\ if it has tried more
than 10 words from its own dictionary against existing passwords, it
quits. Thus forcing \variable{pleasequit} to be set in the system
libraries will do very little to stem the progress of this
virus\footnote{Although it was suggested very early \cite{spaf1}.}.

\subsection{Cracking routines}
This collection of routines is the ``brain'' of the virus.
\subroutine{cracksome}, the main switch, chooses which of four
strategies to execute. It is would be the central point for adding
new strategies if the virus were to be further extended. The virus
works each strategy through completely, then switches to the next
one. Each pass through the cracking routines only performs a small
amount of work, but enough state is remembered in each pass to
continue the next time around.

\subsubsection{cracksome}
The \subroutine{cracksome} routine is the central switching routine of the
cracking code. It decides which of the cracking strategies is actually
exercised next. Again, note that this routine was named in the global
symbol table. It could have been given a confusing or random name, but
it was actually clearly labelled, which lends some credence to the
idea that the virus was released prematurely.

\subsubsection{Phase 0}
\label{phase0}
The first phase of the \subroutine{cracksome} 
routines reads through the \filename{/etc/hosts.equiv} file
to find machine names that would be likely targets. While this file
indicates what hosts the current machine trusts, it is fairly common
to find systems where all machines in a cluster trust each other, and
at the very least it is likely that people with accounts on this machine will
have accounts on the other machines mentioned in \filename{/etc/hosts.equiv}.

It also reads the \filename{/.rhosts} file, which lists the set of machines
that this machine trusts root access from. Note that it does not take
advantage of the trust itself \cite{baldwin} but merely
uses the names as a list of additional machines to attack. Often,
system managers will deny read access to this file to any user other
than root itself, to avoid providing any easy list of secondary
targets that could be used to subvert the machine; this practice would 
have prevented the virus from discovering those names, although
\filename{/.rhosts} is very often a subset of \filename{/etc/hosts.equiv}.

The program then reads the entire local password file,
\filename{/etc/passwd}. It uses this to find personal \filename{.forward}
files, and reads them in search of names of other machines it can
attack. It also records the user name, encrypted password, and {\it
GECOS}
%%%\footnote{one of the fields of the UNIX user information data.
%%%It is only named in the documentation on changing it, the original
%%%meaning seems to have been lost.}
information string, all of which are
stored in the \filename{/etc/passwd} file. Once the program 
scanned the entire file, it advanced to Phase 1.

\subsubsection{Phase 1}
\label{obviouspw}
This phase of the cracking code attacked passwords on the local
machine. It chose several likely passwords for each user, which were
then encrypted and compared against the encryptions obtained in Phase
0 from \filename{/etc/passwd}:
\begin{items}
\item No password at all.
\item The user name itself.
\item The user name appended to itself.
\item The second of the comma separated {\it GECOS} information fields,
which is commonly a nickname.
\item The remainder of the full name after the first name in the
{\it GECOS} fields, i.e. probably the last name, with the first letter
converted to lower case.
\item This ``last name'' reversed.
\end{items}

All of these attacks are applied to fifty passwords at a time from
those collected in Phase 0. Once it had tried to guess the
passwords for all local accounts, it advanced to Phase 2.

\subsubsection{Phase 2}
Phase 2 takes the internal word list distributed as part of the
virus (see Appendix~\ref{dict}) and shuffles it. Then it takes
the words one at a time and decodes them (the high bit is set on all
of the characters to obscure them) and tries them against all
collected passwords. It maintains a global variable \variable{nextw}
as an index into this table. The main loop uses this to
prevent \variable{pleasequit} from causing the virus to exit until
at least ten of the words have been checked against all of the
encryptions in the collected list.

Again, when the word list is exhausted the virus advances to Phase 3.

\subsubsection{Phase 3}
Phase 3 looks at the local \filename{/usr/dict/words} file, a 24474
word list distributed with 4.3BSD (and other UNIX systems) as a
spelling dictionary. The words are stored in this file one word per
line. One word at a time is tried against all encrypted passwords. If
the word begins with an upper case letter, the letter is converted to
lower case and the word is tried again.

When the dictionary runs out, the phase counter is again advanced to 4
(thus no more password cracking is attempted).

\subsection{H routines}
\label{hroutines}
The ``h routines'' are a collection of routines with short names,
such as \subroutine{hg}, \subroutine{ha}, 
\subroutine{hi}, and \subroutine{hl}, which search for other hosts to attack.

\subsubsection{hg}
The \subroutine{hg} routine calls \subroutine{rt\_init} (if it has not
already been called) to scan the routing table, and records all
gateways except the loopback address in a special list.  It then tries
a generic attack routine to attack via \prog{rsh},
\prog{finger}, and SMTP.  It returns after the first successful attack.

\subsubsection{ha}
\label{ha}
The \subroutine{ha} routine goes through the gateway list and
connects to TCP port 23, the telnet port, looking for gateways which are
running telnet listeners.  It randomizes the order of such gateways
and calls 
\subroutine{hn} (our name) with the network number of each gateway.
The \subroutine{ha} returns after \subroutine{hn} reports that it has
succeeded broken into a host.

\subsubsection{hl}
The \subroutine{hl} routine iterates through all the addresses for the
local machine calling \subroutine{hn} with the network number for
each one.  It returns if \subroutine{hn} indicates success in breaking
into a host.

\subsubsection{hi}
The \subroutine{hi} routine goes through the internal host list (see
section~\ref{phase0}) and tries to attack each host via \prog{rsh},
\prog{finger}, and SMTP.  It returns if when one host is infected.

\subsubsection{hn}
\label{hn}
The \subroutine{hn} routine (our name) followed \subroutine{hi} takes a
network number as an argument.  Surprisingly it returns if the network
number supplied is the same as the network number of any of the
interfaces on the local machine.  For Class A addresses it uses the
Arpanet IMP convention to create possible addresses to attack
(net.[1-8].0.[1-255]).  For all other networks it guesses hosts number
one through 255 on that network.  It randomizes the order of this list of possible
hosts and tries to attack up to twenty of them using \prog{rsh},
\prog{finger}, and SMTP.  If a host does not accept connections on
TCP port 514, the \prog{rsh} port, \subroutine{hn} will not try to
attack it. If a host is successfully attacked \subroutine{hn} returns.

\subsubsection{Usage}
The ``h routines'' are called in groups in the main loop; if the first
routine succeedes in finding a vulnerable host the remaining routines
are not called in the current pass. 
Each routine returns after it finds one vulnerable host.  The
\subroutine{hg} routine is always called first, which indicates the virus
really wanted to infect gateway machines.  Next comes \subroutine{hi}
which tried to infect normal hosts found via \subroutine{cracksome}.
If \subroutine{hi} fails, \subroutine{ha} is called, which seemed to try
breaking into hosts with randomly guessed addresses on the far side of
gateways.  This assumes that all the addresses for gateways had been
obtained (which is not trivial to verify from the convoluted code in
\subroutine{{rt init}}),
and implies that the virus would prefer to infect a gateway and from
there reach out to the gateway's connected networks, rather than
trying to hop the gateway directly.  If \subroutine{hg},
\subroutine{hi}, and \subroutine{ha} all failed to infect a host, then
\subroutine{hl} is called which is similar to \subroutine{ha} but uses
for local interfaces for a source of networks.

It is not clear that \subroutine{ha} and \subroutine{hl} worked.
Because \subroutine{hn} returns if the address is local,
\subroutine{hl} appears to have no chance of succeeding.  If alternate
addresses for gateways are indeed obtained by other parts of the virus
then \subroutine{ha} could work.  But if only the addresses in the
routing table were used it could not work, since by definition these
addresses must be on a directly connected network.  Also, in our
monitoring we never detected an attack on a randomly generated
address.  These routines do not seem to have been functional.

\subsection{Attack routines}
There are a collection of attack routines, all of which try to obtain a
Bourne Shell running on the targeted machine. See Appendix \ref{hook} for a
description of the
\filename{l1.c} program, used by all the attack routines.


\subsubsection{hu1}
The \subroutine{hu1} routine is called by the Phase 1 and Phase 3
\subroutine{cracksome} subroutines. Once a password for user name
guessed correctly, this routine is called with a host name read from
either the user's
\filename{.forward} or \filename{.rhosts} files.  In order to assume
the user's id it then tries to
connect to the local machine's \prog{rexec} server using the guessed
name and password.  If successful it runs an \prog{rsh} to
the target machine, trying to execute a Bourne Shell, which it uses to
send over and compile the
\filename{l1.c} infection program. 

\subsubsection{Hit SMTP}
This routine make a connection to TCP port 25, the SMTP port, of a remote
machine and used it to take advantage of the sendmail bug. It
attempts to use the debug option to make \prog{sendmail} run a
command (the ``recipient'' of the message), which transfers the
\filename{l1.c} program included in the body of the message.

\subsubsection{Hit finger}
The ``hit finger'' routine tries to make a connection to TCP port 79, the finger
port, of the remote machine. Then it creates a ``magic packet'' which
consists of
\begin{items}
\item A 400 byte ``runway'' of VAX ``nop'' instructions, which can be
executed harmlessly.
\item A small piece of code which executes a Bourne Shell.
\item A stack frame, with a return address which would hopefully point
into the code.
\end{items}

Note that the piece of code is VAX code, and the stack frame is a VAX
frame, in the wrong order for the Sun. Thus, although the Sun finger
daemon has the same bug as the VAX one, this piece of code cannot
exploit it.

\begin{figure*}[tp]
\begin{verbatim}
        pushl   $68732f         push '/sh<NUL>'
        pushl   $6e69622f       push '/bin'
        movl    sp,r10          save address of start of string
        pushl   $0              push 0 (arg 3 to execve)
        pushl   $0              push 0 (arg 2 to execve)
        pushl   r10             push string addr (arg 1 to execve)
        pushl   $3              push argument count
        movl    sp,ap           set argument pointer
        chmk    $3b             do "execve" kernel call.
\end{verbatim}
\caption{VAX intructions for the \prog{finger} attack.\label{vaxcode}}
\end{figure*}

The attack on the finger daemon is clearly a lysogenetic ``viral''
attack (see Section~\ref{rose}), since although a worm doesn't modify
the host machine at all, the finger attack does modify the running
finger daemon process.  The ``injected DNA'' component of the virus
contained the VAX instructions shown in Figure~\ref{vaxcode}.

The \subroutine{execve} system call causes the current process to be
replaced with an invocation of the named program; \filename{/bin/sh}
is the Bourne Shell, a UNIX command interpreter.  In this case, the
shell winds up 
running with its input coming from, and its output going to, the
network connection.  The virus then sends over the \filename{l1.c}
bootstrap program.

\subsubsection{Hit rsh}
This unlabeled routine
%%%\footnote{at $21e0_{16}$}
tries \prog{rsh} to the
target host (assuming it can get in as the current user). It tries
three different names for the \prog{rsh} binary,
\begin{items}
\item \filename{/usr/ucb/rsh}
\item \filename{/usr/bin/rsh}
\item \filename{/bin/rsh}
\end{items}
If one of them succeeds, it tries to resynchronize (see
Appendix~\ref{resynch}) the connection; if that doesn't succeed within
thirty seconds it kills off the child process. If successful the
connection can then be used to launch the \filename{l1.c} ``grappling
hook'' program at the victim.

Note that this infection method doesn't specify a user name to attack;
if it gets into the remote account, it is because the user that the
virus is running as also has an 
%%% due to the user having an
account on the other machine which trusts the originating machine.

\subsubsection{Hit rexec}
The \subroutine{{hit rexec}} routine uses the remote
execution system which is similar to \prog{rsh}, but designed for use by
programs. It connects and sends the user name, the password, and
\filename{/bin/sh} as the command to execute. 
%%% It checks to see if the
%%% request succeeded, and tries to use one of the password/account
%%% pairs guessed earlier to get in.

\subsubsection{makemagic}
This routine tries to make a telnet connection to each of
the available addresses for the current victim. It broke the connections
immediately, often producing error reports from the telnet
daemon, which were recorded, and provide some of the earliest reports
of attack attempts.\footnote{On fast machines, such as the DEC VAX
3200, there may be no record of these attacks, since the connection is
handed off fast enough to satisfy the daemon.}

If it succeedes in reaching the host, it creates a TCP listener on a
random port number which the infected machine would eventually connect
back to.

\subsection{Grappling Hook}
\label{hook}
A short program, named \filename{l1.c}, is the common grappling
hook 
\iffalse
(See section~\ref{l1pic} for an overview.)
\fi
that all of
the attack routines use to pull over the rest of the virus. It is robustly
written, and fairly portable. It ran on a number of machines which
were neither VAX or Sun, loading them down as well, but only making
them peripheral victims of the virus.

The first thing it does is delete the binary it was running from. It
checks that it has three arguments (exiting if there aren't three of
them). It closes all file descriptors and then forks, exiting if the
fork fails.  If it succeeds, the parent exits; this leaves no
connection from the child to the infection route.

Next, it creates a TCP connection back to the address given as
the first argument, and the port given as the second. Then it sends
over the magic number given as the third. The text of each argument is
erased immediately after it is used. The stream connection is then
reused as the program's standard input and output.

A loop reads in a length (as a network byte order 32-bit integer) and
then a filename.  The file is unlinked and opened for write, and then
the file itself is read in (using the number of bytes read in
earlier.) On any error, all of the files are unlinked. If the length
read in is -1, the loop exits, and a Bourne Shell is executed
(replacing the \filename{l1} program, and getting its input from the
same source.)

\subsection{Install Routines}
There are a variety of routines used to actually move the virus from
one machine to the other. They deal with the ``virus protocol''
connection made by the \filename{l1.c} injected program or with the shell
that it spawns.

\subsubsection{resynch}
\label{resynch}
The \subroutine{resynch} routine sends commands to a remote shell,
requesting that it echo back a specific randomly chosen number. It
then waits a certain amount of time for a response. This routine is
used to indicate when the various subprograms of the infection procedure
have compiled or executed and a Bourne Shell prompt is available
again.

\subsubsection{waithit}
\label{mkdir}
This routine does much of the high level work. It waits (up to 2
minutes) for a return connection from a victim (which has had
\filename{l1.c} injected into it.) It then tries to read a magic
number (which had been previously sent to that victim as a command
line argument to the \filename{l1} program) and gives up after ten
seconds.

After the connection is established, all of the current ``objects'' in
storage in the virus are fed down the connection into the victim. Then
it tries to resynchronize, and if it succeeds, sends down commands to
\begin{items}
\item set the \command{PATH} of the victim shell
\item try to delete \prog{sh} in the current directory (\filename{/usr/tmp})
\item if the delete fails, pick a random name to use
instead~\footnote{Since the delete command used (\command{rm -f}) did not
remove directories, creating a directory \filename{/usr/tmp/sh}
stoped the virus\cite{spaf2}.  However, the virus would still use CPU
resources attempting to link the objects, even though it couldn't
write to the output file (since it was a directory).} 
\item scan the list of objects, looking for names ending in \filename{.o}
\item link and run each of these, with the command line arguments
\begin{items}
\item \command{-p \$\$}, where \command{\$\$} is the process id of the victim
shell
\item each object name
\end{items}
\item resynchronize; if this fails, assume that the virus succeeded
(since the \command{-p} option tells the virus to kill off the parent
shell) and set flag bit 1 of the host list entry (the host list is
detailed in section~\ref{hosts}).
\item delete the compiled program, and go on to the next object.
\end{items}

Thus, to add another machine type, the virus merely needs to be
started with a new object binary as a command line option, which will
then be propagated to the next infected host and tried.

Note that the path used here was
\filename{PATH=} \filename{bin: } \filename{/usr/bin: } \filename{/usr/ucb}
which is certainly reasonable on most systems. This protects
systems with ``unusual'' filesystem layouts, and suggests that complete
consistency among systems makes them more vulnerable.

\subsection{Host modules}
\label{hosts}
These are a set of routines designed to collect names and addresses of
target hosts in a master list.  Each entry contains up to six
addresses, up to twelve names, and a flags field.

\subsubsection{Name to host}
This routine searches the host list for a given named host, returns
the list entry describing it, and optionally adds it to the list if it
isn't there already.

\subsubsection{Address to host}
This routine searches the host list for a given host address, returns
the list entry describing it, and optionally adds it to the list if it
isn't there already.

\subsubsection{Add address/name}
These two routines added an address or a name to a host list entry,
checking to make sure that the address or name was not already known.

\subsubsection{Clean up table}
This routine cycles through the host list, and removes any hosts which
only have flag bits 1 and 2 set (and clears those bits.) Bit 1 is set
when a resynchronize (in \subroutine{waithit}) fails, probably indicating
that this host ``got lost''. Bit 2 is set when a named host has no
addresses, or when several different attack attempts fail. Bit 3 is
set when Phase 0 of the crack routines successfully retrieves an
address for the host.

\subsubsection{Get addresses}
This routine takes an entry in the host table and tries to fill in the the
gaps. It looks up an address for a name it has, or looks up a name for the
addresses it has. It also includes any aliases it can find.

\subsection{Object routines}
These routines are what the system uses to pull all of its pieces into
memory when it starts (after the host has been infected) and then to
retrieve them to transmit to any host it infects.

\subsubsection{Load object}
This routine opens a file, determines its length, allocating the
appropriate amount of memory, reads it in as one block, decodes the
block of memory (with XOR). If the object name contains a comma, it
moves past it and starts the name there.

\subsubsection{Get object by name}
This routine returns a pointer to the requested object. This is used
to find the pieces to download when infecting another host.

\subsection{Other initialization routines}

\subsubsection{if init}
This routine scans the array of network interfaces. It gets the flags
for each interface, and makes sure the interface is UP and RUNNING
(specific fields of the flag structure). If the entry is a point
to point type interface, the remote address is saved and added to the
host table. It then tries to enter the router into the list of hosts
to attack.

\subsubsection{rt init}
This routine runs \prog{netstat -r -n} as a subprocess. This shows the
routing table, with the addresses listed numerically. It gives up
after finding 500 gateways. It skips the default route, as well as the
loopback entry. It checks for redundant entries, and checks to see if
this address is already an interface address. If not, it adds it
to the list of gateways.

After the gateway list is collected, it shuffles it and enters the
addresses in the host table.

\subsection{Interlock routines}
\label{interlock}
The two routines \subroutine{checkother} and \subroutine{othersleep}
are at the heart of
the excessive propagation of the virus. It is clear that the author
intended for the virus to detect that a machine was already infected, and
if so to skip it. The code is actually fraught with timing flaws and
design errors which lead it to permit multiple infections, probably more
often than the designer intended\footnote{This behavior was noted by both
looking at the code and by creating a testbed setup, manually running a
program that performs the checking and listening functions.}.

An active infection uses the \subroutine{othersleep} routine for two
purposes, first to sleep so that it doesn't use much processor
time, and second to listen for requests from ``incoming'' viruses. The
virus which is running \subroutine{othersleep} is referred to as the
``listener'' and the virus which is running \subroutine{checkother} is
referred to as the ``tester''.

\subsubsection{Checkother}
The tester tries to connect to port 23357 on the local machine (using
the loopback address, 127.0.0.1) to see if it can connect to a
listener. If any errors occur during this check, the virus assumes
that no listener is present, and tries to become a listener itself.

If the connection is successful, the checker sends a magic
number\footnote{$874697_{16}, 8865431_{10}, 041643227_{8}$}, and listens
(for up to 300 seconds) for a magic number from the
listener\footnote{$148898_{16}, 1345688_{10}, 05104230_{8}$}.  If the
magic number is wrong, the checker assumes it is being spoofed and
continues to run.

The checker then picks a random number, shifts it right by three
(throwing away the lower three bits) and sends it to the listener. It
expects a number back within ten seconds, which it adds to the one
sent. If this sum is even, the sender increments \variable{pleasequit},
which (as noted in section~\ref{pleasequit}) does very little.

Once it has finished communicating (or failing to communicate) with
the listener, the checker sleeps for five seconds and tries to become
a listener. It creates a TCP stream socket, sets the socket
options to indicate that it should allow multiple binds to that
address (in case the listener {\it still} hasn't exited, perhaps?) and
then binds the socket to port 23357, and listens on it (permitting a
backlog of up to ten pending connections.) 

\subsubsection{Othersleep}
The \subroutine{othersleep} routine is run when the main body of the virus
wants to idle for a period of time. This was apparently intended to help the
virus ``hide'' so that it wouldn't use enough processor time to be
noticed. While the main program sleeps, the listener code waits to see
if any checkers have appeared and queried for the existence of a
listener, as a simple ``background task'' of the main virus.

The routine first checks to see if it has been set up as a listener;
if not, it calls the normal \subroutine{sleep} function to sleep for the
requested number of seconds, and returns.

If it is set up as a listener, it listens on the checking port with a
timeout. If it times out, it returns, otherwise it deals with the
connection and subtracts the elapsed real time from the time out
value.

The body of the listener ``accepts'' the connection, and sends a magic
number to the checker. It then listens (for up to 10 seconds) for the
checker's magic number, and picks a random number. It shifts the
random number right by three, discarding the lower bits, and sends it
up to the checker; it then listens (for up to 10 seconds) for a random
number from the checker. If any of these steps fail, the connection is
closed and the checker is ignored.

Once the exchanges have occurred, the address of the incoming
connection is compared with the loopback address.
If it is not from the loopback address, the attempt is ignored. If it
is, then if the sum of the exchanged random numbers is odd, the
listener increments \variable{pleasequit} (with little effect, as
noted in section~\ref{pleasequit}) and closes the listener connection.

\iffalse
\subsection{Flow Diagrams}
\label{diagrams}
The following diagrams show the flow of data and control in some of
the more complex parts of the virus. The first shows how the
``grappling hook'' (\prog{l1.c}) was launched from an infected machine
into a victim machine. The second shows how the virus gathers
information about the host it has infected and organizes an attack on
another host.

\subsubsection{l1 Infection Path}\label{l1pic}

The common infection mechainism used to copy the virus into a vulnerable host is shown in Figure \ref{l1fig}.

\begin{figure*}[tp]
\vspace{3.75in}
\special{psfile=l1picture.ps hoffset=-108 voffset=-468}
\caption{The common infection mechanism.\label{l1fig}}
\end{figure*}
\fi
DataMuseum.dk

DKUUG/EUUG Conference tapes

⟦9b267033e⟧ TextFile

Derivation

TextFile