\input texinfo @c -*-texinfo-*-

@setfilename tar-info
@settitle The @code{tar} Manual: DRAFT
@ifinfo
This file documents the tape archive of the GNU system.

Copyright (C) 1988 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

@ignore
Permission is granted to process this file through TeX and print the
results, provided the printed document carries copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).

@end ignore
Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that
the entire resulting derived work is distributed under the terms of
a permission notice identical to this one.

Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for
modified versions.
@end ifinfo

@iftex
@finalout
@end iftex

@titlepage
@sp 11
@center @titlefont{tar}
@sp 1
@center The GNU tape archive
@sp 2
@center by Jay Fenlason
@sp 2
@center DRAFT!
@sp 1
@center @today
@page
@vskip 0pt plus 1filll

This manual describes the GNU tape archiver, @code{tar}, and how you
can use it to store copies of a file or a group of files in an
@dfn{archive}.  This archive may be written directly to a magnetic
tape or other storage medium, stored as a file, or sent through a
pipe to another program.  @code{tar} can also be used to add files
to an already existing archive, list the files in an archive, or
extract the files in the archive.
@sp 2
GNU @code{tar} was written by John Gilmore, and modified by many
people.  The GNU enhancements were written by Jay Fenlason.
@page
@vskip 0pt plus 1filll
Copyright @copyright{} 1988 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

@ignore
Permission is granted to process this file through TeX and print the
results, provided the printed document carries copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).

@end ignore
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the same conditions as for modified versions.

@end titlepage

@ifinfo
@node Top, Why, , (dir)
@ichapter Using the Tape Archiver

You can use the GNU tape archiver, @code{tar}, to store copies of a file
or a group of files in an @dfn{archive}.  This archive may be written
directly to a magnetic tape or other storage medium, stored as a file,
or sent through a pipe to another program.  @code{tar} can also be used
to add files to an already existing archive, list the files in an
archive, or extract the files in the archive.

@menu
* Why::			What @code{tar} archives are good for.
* Commands::		How to tell @code{tar} what to do.
* Options::		Options that change the way @code{tar} behaves.
* FullDumps::		Using @code{tar} to perform full dumps.
* IncDumps::		Using @code{tar} to perform incremental dumps.
* Problems::		Common problems using @code{tar}.
* Rem Tape::		The remote tape server.
* Format::		The format of a @code{tar} archive.
@end menu
@end ifinfo

@node Why, Commands, Top, Top
@chapter The Uses of Tape Archives

The tape archiver @code{tar} allows you to store many files in an
@dfn{archive file} or @dfn{tar file} which describes the names and contents
of the constituent files.  Later you can extract some or all of these files
from the archive.

Tar files are not restricted to magnetic tapes.  The @code{tar} program
can equally well use an ordinary file, or a pipe, or any device, as the
archive.  But they were originally designed for use with magnetic tapes,
and that is how the name ``tar'' came about.

Archive files can be used for transporting a group of files from one system
to another:  put all relevant files into an archive on one computer system,
transfer the archive to another, and extract the contents there.  The basic
transfer medium might be magnetic tape, Internet FTP, or even electronic
mail (though you must encode the archive with @code{uuencode} in order to
transport it properly by mail).  The two machines need not use the same
operating system, as long as they both support the @code{tar} program.

A magnetic tape can store several files in sequence, but has no names for
them, just relative position on the tape.  A tar file or something like it
is the only way to store several files on one tape and retain their names.
Even when the basic transfer mechanism can keep track of names, as FTP can,
the nuisance of handling multiple files, directories, and multiple links,
may make a tar file a much easier method.

Archive files are also used for long-term storage, which you can think
of as transportation from one time to another.

Piping one @code{tar} to another is an easy way to copy a directory's
contents from one disk to another, while preserving the dates, modes, owners
and link-structure of all the files therein.
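
For instance, a directory tree might be copied to another disk as
follows.  This is only a sketch: the directory names
@file{/usr/src/project} and @file{/mnt/newdisk/project} are
hypothetical, and the destination directory is assumed to exist
already.

@example
# Write an archive of the current directory to standard output and
# unpack it, with modes preserved, in the destination directory.
cd /usr/src/project
tar -c -f - . | (cd /mnt/newdisk/project; tar -x -p -f -)
@end example

@noindent
The @samp{-f -} option makes the first @code{tar} write the archive to
its standard output and the second read it from its standard input
(@pxref{General Options}).  The @samp{-p} option is described with the
other extraction options (@pxref{Extraction Options}).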

The GNU version of @code{tar} has special features that allow it to be
used to make incremental and full dumps of all the files in a
filesystem.

@node Commands, Options, Why, Top
@chapter The Different Operations @code{tar} Can Perform

One program, @code{tar}, is used to create an archive, to extract files
from an archive, to modify an archive, or to list the contents.  Each
time you run @code{tar}, you must give a @dfn{command} to specify which
one of these things you want to do.

The command must always be in the first argument to @code{tar}.  This
argument can also contain options (@pxref{Options}).  For compatibility
with Unix @code{tar}, the first argument is always treated as containing
command and option letters even if it doesn't start with @samp{-}.  Thus,
@samp{tar c} is equivalent to @samp{tar -c}: both of them specify the
@samp{-c} command to create an archive.

The remaining arguments to @code{tar} are either options, if they start
with @samp{-}, or files to operate on.

The file names that you give as arguments are the files that @code{tar} will
act on--for example, they are the files to put in the archive, or the files
to extract from it.  If you don't give any file name arguments, the default
depends on which command you used.  Some commands use all relevant files;
some commands have no default and will report an error if you don't specify
files.

If a file name argument actually names a directory, then that directory
and all files and subdirectories in it are used.

Here is a list of all the @code{tar} commands:

@table @samp
@item -c
The @samp{-c} command tells @code{tar} to create a new archive that
contains the file(s) specified on the command line.  If you don't
specify files, all the files in the current directory are used.

If the archive file already exists, it is overwritten; the old contents
are lost.

@item -d
The @samp{-d} command causes @code{tar} to compare the archive with
the files in the file system.  It will report differences in file
size, mode, owner, and contents.  If a file exists in the archive, but
not in the file system, @code{tar} will report this.

If you specify file names, those files are compared with the archive and
they must all exist in the archive.  If you don't specify files, all
the files in the archive are compared.

@item -r
The @samp{-r} command causes @code{tar} to add the specified file(s)
to the end of the archive.  This assumes that the archive file already
exists and is in the proper format (which probably means it was
created previously with the @code{tar} program).  If the archive is
not in a format that @code{tar} understands, the results will be
unpredictable.

You must specify the files to be used; there is no default.

@item -t
The @samp{-t} command causes @code{tar} to display a list of the files
in the archive.  If you specify file names, only the files
that you specify will be mentioned (but each of them is mentioned only
if it appears in the archive).

@item -u
The @samp{-u} command causes @code{tar} to add the specified files to
the end of the archive, like @w{@samp{-r}}, but only when a file doesn't
already exist in the archive or is newer than the version in the
archive (last-modification time is compared).  This command can be
very slow.

You must specify the files to be used; there is no default.

@item -x
The @samp{-x} command causes @code{tar} to extract the specified files
from the archive.  If no file names are given, all the files in the
archive will be extracted.

@item -A
The @samp{-A} command is used for concatenating several archive files
into one big archive file.  The files to operate on should all be
archive files.  They are all appended to the end of @emph{the} archive
file which @code{tar} works on.  (The other files are not changed).

You might be tempted to use @code{cat} for this, but it won't
ordinarily work.  A @code{tar} archive contains data which indicates
the end of the archive, so more material added to the end with
@code{cat} would be ignored.  The @samp{tar -A} command works because
it removes the end-of-archive markers from the middle of the result.

@item -D
The @samp{-D} command causes @code{tar} to delete the specified files
from the archive.  This command is extremely slow.  Warning:  Use of
this command on archives stored on magnetic tape may result in a
scrambled archive.  There is no safe way (except for completely
re-writing the archive) to delete files from a magnetic tape.
@end table
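
The following session sketches how several of these commands fit
together.  The archive name @file{project.tar} and the file names are
made up for the illustration, and the archive is kept in an ordinary
disk file so that the updating commands can work on it.

@example
# Create a new archive holding two files.
tar -c -f project.tar main.c util.c
# Append a third file to the end of the archive.
tar -r -f project.tar README
# List the names of the files now in the archive.
tar -t -f project.tar
# Compare the archive with the files on disk.
tar -d -f project.tar
# Extract a single file from the archive.
tar -x -f project.tar util.c
@end example

@noindent
Each command is given as a separate first argument, and the archive is
named with the @samp{-f} option (@pxref{General Options}).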

@node Options, FullDumps, Commands, Top
@chapter Options That Change How @code{tar} Works

Options may be specified as individual arguments starting with @samp{-}.
In this case, if the option wants an argument (as does, for example,
@samp{-f}) then the argument should come after the option, separated
from it by a space.
All options are optional.  Some options make sense with any command, while
others are meaningful only with particular commands.@refill

@menu
* General Options::		Options that are always meaningful.
* Creation Options::		Options for creating or updating an archive.
* Extraction Options::		Options for listing or extracting files.
* Option Syntax::		Old syntax for options
@end menu

@node General Options, Creation Options, Options, Options
@section Options That Are Always Meaningful

@table @code
@item -b @var{number}
This option is used to specify a @dfn{blocking factor} for the
archive.  When reading or writing the archive, @code{tar} transfers
data in blocks of @var{number}*512 bytes.

The default blocking factor is set when @code{tar} is compiled, and is
typically 20.

Blocking factors larger than 20 cannot be read by very old versions of
@code{tar}, or by some newer versions of @code{tar} running on old machines
with small address spaces.

With a magnetic tape, larger blocks give faster throughput and fit
more data on a tape (because there are fewer inter-record gaps).  If
the archive is in a disk file or a pipe, you may want to specify a
smaller blocking factor, since a large one will result in a large
number of null bytes at the end of the archive.

When writing cartridge or other streaming tapes, a much larger
blocking factor (say 126 or more) will greatly increase performance.
However, you must specify the same blocking factor when reading or
updating the archive.

With GNU @code{tar} the blocking factor is limited only by the maximum
block size of the device containing the archive, or by the amount of
available virtual memory.

@item -f @var{filename}
This option is used to specify the file name of the archive @code{tar}
works on.

If no @samp{-f} option is given, but the environment variable
@code{TAPE} exists, its value is used; otherwise, a default archive
name (which was picked when @code{tar} was compiled) is used.  The
default is normally set up to be the ``first'' tape drive or other
transportable I/O medium on the system.

If the filename is @samp{-}, @code{tar} reads the archive from
standard input (when listing or extracting), or writes it to standard
output (when creating).  If the @samp{-} filename is given when
updating an archive, @code{tar} will read the original archive from
its standard input, and will write the entire new archive to its
standard output.

If the filename contains @samp{:/dev/}, it is interpreted as
@samp{hostname:filename}.  If the @var{hostname} contains an at sign
(@samp{@@}), it is treated as @samp{user@@hostname:filename}.
In either case, @code{tar} will invoke the command @code{rsh}
(or @code{remsh}) to start
up an @code{/etc/rmt} on the remote machine.  If you give an alternate login
name, it will be given to the @code{rsh}.  Naturally, the remote machine must
have a copy of @file{/etc/rmt}.  @code{/etc/rmt} is free software
from the University of California, and a copy of the source code can be found
with the sources for @code{tar}.  @code{/etc/rmt} will have to be modified to
run on non-BSD4.3 systems.@refill

@item -C @var{dir}
The @samp{-C} option causes @code{tar} to change into the
directory @var{dir} before continuing.  This option is usually
interspersed with the files @code{tar} is to work on.  For example,

@example 
tar -c iggy ziggy -C baz melvin
@end example

@noindent
will place the files @file{iggy} and @file{ziggy} from the current
directory on the tape, followed by the file @file{melvin} from the
directory @file{baz}.  This option is especially useful when you have
several widely separated files that you want to store in the same
directory in the archive.

Note that the file @file{melvin} is recorded in the archive under the
precise name @file{melvin}, @emph{not} @file{baz/melvin}.  Thus, the
archive will contain three files that all appear to have come from the
same directory; if the archive is extracted with plain @samp{tar -x},
all three files will be created in the current directory.

Contrast this with the command

@example
tar -c iggy ziggy baz/melvin
@end example

@noindent
which records the third file in the archive under the name @file{baz/melvin}
so that, if plain @samp{tar -x} is used, the third file will be created
in a subdirectory named @file{baz}.

@item -M
The @samp{-M} option causes @code{tar} to write a @dfn{multi-volume}
archive--one that may be larger than will fit on the medium used to
hold it.

When this option is used, @code{tar} will not abort when it cannot
read or write any more data.  Instead, it will ask you to prepare a
new volume.  If the archive is on a magnetic tape, you should change
tapes now; if the archive is on a floppy disk, you should change
disks, etc.

Each volume of a multi-volume archive is an independent tar archive,
complete in itself.  For example, you can list or extract any volume
alone (just don't specify @samp{-M}).  However, if one file in the
archive is split across volumes, the only way to extract it
successfully is with a multi-volume extract command (@samp{-xM})
starting on or before the volume where the file begins.

@item -N @var{date}
This option causes @code{tar} to only work on files whose modification
or inode-changed times are newer than the @var{date} given.  The main
use is for creating an archive; then only new files are written.  If
extracting, only newer files are extracted.

Remember that the entire date argument should be quoted if it contains
any spaces.

The date is parsed using @code{getdate}.

@item -R
If @samp{-R} is used, @code{tar} prints, along with every message it
would normally produce, the record number within the archive where
the message occurred.  This option is especially useful when reading
damaged archives, since it helps pinpoint the damaged sections.

This can also be useful when making a log of a file-system backup tape,
since the results allow you to find the file you want to retrieve
on several backup tapes and choose the tape where the file appears
earliest (closest to the front of the tape).

@item -T @var{filename}
Instead of taking the list of files to work on from the command
line, the list of files to work on is read from the file
@var{filename}.  If @var{filename} is given as @samp{-}, the list is
read from standard input.  Note that using both @samp{-T -} and
@samp{-f -} will not work unless you are using the @samp{-c} command.

@item -v
This option causes @code{tar} to be verbose about the actions it is
taking.

Normally, the @samp{-t} command to list an archive prints
just the file names (one per line) and the other commands are silent.

@samp{-tv} prints a full line of information about each file, like the
output of @samp{ls -l}.  @samp{-v} with any other command (aside from
@samp{-t}) prints just the name of each file operated on.

The output from @samp{-v} appears on the standard output except when
creating or updating an archive to the standard output, 
in which case the output from @samp{-v} is sent to the standard
error.

@item -version

This option causes @code{tar} to print out its version number to the
standard error.  In order to avoid being confused with the @samp{-v}
option, @samp{-version} must be given as a separate option,
preceded by a hyphen.

@item -w
This option causes @code{tar} to print a message for each action it
intends to take, and ask for confirmation on the terminal.  To
confirm, you must type a line of input.  If your input line begins
with @samp{y}, the action is performed, otherwise it is skipped.

The actions which require confirmation include adding a file to the
archive, extracting a file from the archive, deleting a file from the
archive, and deleting a file from disk.

If @code{tar} is reading the archive from the standard input,
@code{tar} will open the file @file{/dev/tty} and ask for
confirmation there.

@item -X @var{file}
This option causes @code{tar} to read a list of filenames
(actually regular expressions) from the file @var{file};
@code{tar} will ignore files with those names.
Thus if @code{tar} is called as @samp{tar -c -X foo .}
and the file @file{foo} contains @samp{*.o}, none of the files
whose names end in @file{.o} in the current directory will be added
to the archive.  Multiple @code{-X} options may be given.

@item -z
@itemx -Z
The archive should be compressed as it is written, or decompressed
as it is read, using the @code{compress} program.  This option works
on physical devices (tape drives, etc.) and remote files as well as
on normal files; data to or from such devices or remote files is
reblocked by another copy of the @code{tar} program to enforce the
specified (or default) block size.  The default compression
parameters are used; if you need to override them, avoid the
@samp{-z} option and run @code{compress} explicitly.

If the @samp{-z} option is given twice, @code{tar} will pad the
archive out to the next block boundary (@pxref{General Options}).  This
may be useful with some devices that require that all write operations
be a multiple of a certain size.

Note that the @samp{-z} option will not work with the @samp{-M} option,
or with the @samp{-u}, @samp{-r}, @samp{-A}, or @samp{-D} commands.
@end table
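
A few of these options can be combined as in the sketch below.  The
file names @file{skip}, @file{/tmp/src.tar}, @file{/tmp/csrc.tar} and
@file{/tmp/recent.tar}, the patterns, and the date are all
hypothetical.

@example
# Archive the current directory, leaving out files matching *.o.
echo '*.o' > skip
tar -c -v -f /tmp/src.tar -X skip .
# Archive only the files whose names find prints, reading the
# list from standard input.
find . -name '*.c' -print | tar -c -f /tmp/csrc.tar -T -
# Archive files changed since the given date; the date is quoted
# because it contains spaces.
tar -c -f /tmp/recent.tar -N "1 Jan 1988" .
@end example

@noindent
The archives are written outside the directory being archived so that
@code{tar} does not try to store an archive inside itself.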

@node Creation Options, Extraction Options, General Options, Options
@section Options for Creating Or Updating an Archive

These options are used to control which files @code{tar} puts in an
archive, or to control the format the archive is written in (@pxref{Format}).

Except as noted below, these options are useful with the @samp{-c},
@samp{-r}, @samp{-u}, @samp{-A}, and @samp{-D} commands.
The @samp{-B} option (@pxref{Extraction Options})
is also useful with the @samp{-r}, @samp{-u}, @samp{-A}, and @samp{-D} commands.

@table @code
@c this command no longer exists  -D is now the old -J command
@c @item -D
@c The @samp{-D} option tells @code{tar} to only store entries for the
@c directories it encounters, and to not to store the files inside the
@c directories.  In conjunction with @code{find} this is useful for
@c creating incremental dumps for archival backups, similar to those
@c produced by @code{dump}.

@item -G
This option should only be used when creating an incremental backup of
a filesystem.  When the @samp{-G} option is used, @code{tar} writes, at
the beginning of the archive, an entry for each of the directories that
will be operated on.  The entry for a directory includes a list of all
the files in the directory at the time the dump was done, and a flag
for each file indicating whether the file is going to be put in the
archive.  This information is used when doing a complete incremental
restore.

Note that this option causes @code{tar} to create a non-standard
archive that may not be readable by non-GNU versions of the @code{tar}
program.

@item -h
If @samp{-h} is used, when @code{tar} encounters a symbolic link, it
will archive the linked-to file, instead of simply recording the
presence of a symbolic link.  If the linked-to file is archived
again, an entire second copy of it will be archived, instead of a
link.  This could be considered a bug.

@item -l
This option causes @code{tar} to not cross filesystem boundaries
when archiving parts of a directory tree.  This option only
affects files that are archived because they are in a directory that
is archived; files named on the command line are archived
regardless, and they can be from various file systems.

This option is useful for making full or incremental archival backups of
a file system, as with the Unix @code{dump} command.

Files which are skipped due to this option are mentioned on the
standard error.

@item -o
This option causes @code{tar} to write an old format archive, which
does not include information about directories, pipes, fifos,
contiguous files, or device files, and specifies file ownership by
numeric user- and group-ids rather than by user and group names.  In
most cases, a @emph{new} format archive can be read by an @emph{old}
@code{tar} program without serious trouble, so this option should
seldom be needed.  When updating an archive, do not use @samp{-o}
unless the archive was created with the @samp{-o} option.

@item -V @var{name}
This option causes @code{tar} to write out a @dfn{volume header} at
the beginning of the archive.  If @samp{-M} is used, each volume of
the archive will have a volume header of @samp{@var{name} Volume @var{N}},
where @var{N} is 1 for the first volume, 2 for the next, and so on.

@item -W
This option causes @code{tar} to verify the archive after writing it.
Each volume is checked after it is written, and any discrepancies are
recorded on the standard error output.

Verification requires that the archive be on a back-space-able medium.
This means pipes, some cartridge tape drives, and some other devices
cannot be verified.
@end table
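
For example, a labelled and verified archive might be written to a
disk file like this.  The label text and the names @file{project} and
@file{/tmp/project.tar} are only placeholders.

@example
# Write a volume header, then verify the archive once it is written.
tar -c -V "Project snapshot" -W -f /tmp/project.tar project
# List the archive verbosely to check what was written.
tar -t -v -f /tmp/project.tar
@end example

@noindent
A disk file is used here because verification needs a medium that can
be backspaced.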

@node Extraction Options, Option Syntax, Creation Options, Options
@section Options for Listing Or Extracting Files

The options in this section are meaningful with the @samp{-x} command.
Unless otherwise stated, they are also meaningful with the @samp{-t}
command.

@table @code
@item -B
If @samp{-B} is used, @code{tar} will not panic if an attempt to
read a block from the archive does not return a full block.  Instead,
@code{tar} will keep reading until it has obtained a full block.

This option is turned on by default when @code{tar} is reading an
archive from standard input, or from a remote machine.  This is
because on BSD Unix systems a read of a pipe will return however much
happens to be in the pipe, even if it is less than @code{tar}
requested.  If this option were not used, @code{tar} would fail
as soon as it read an incomplete block from the pipe.

This option is also useful with the commands for updating an archive.

@item -G
The @samp{-G} option means the archive is an incremental backup.
Its meaning depends on the command that it modifies.

If the @samp{-G} option is used with @samp{-t}, @code{tar} will
list, for each directory in the archive, the list of files in that
directory at the time the archive was created.  This information is
put out in a format that is not easy for humans to read, but which
is unambiguous for a program: each filename is preceded by either a
@samp{Y} if the file is present in the archive, an @samp{N} if the
file is not included in the archive, or a @samp{D} if the file is a
directory (and is included in the archive).  Each filename is
terminated by a null character.  The last file is followed by an
additional null and a newline to indicate the end of the data.

If the @samp{-G} option is used with @samp{-x}, then when the entry
for a directory is found, all files that currently exist in that directory
but are not listed in the archive @emph{are deleted from the directory}.

This behavior is convenient when you are restoring a damaged file system
from a succession of incremental backups: it restores the entire state
of the file system to what it was when the backup was made.
If you don't use @samp{-G}, the file system will probably fill up
with files that shouldn't exist any more.

@item -i
The @samp{-i} option causes @code{tar} to ignore blocks of zeros in the
archive.  Normally a block of zeros indicates the end of the
archive, but when reading a damaged archive, or one which was created by
@code{cat}-ing several archives together, this option allows
@code{tar} to read the entire archive.  This option is not on by
default because many versions of @code{tar} write garbage after the
zeroed blocks.

Note that this option causes @code{tar} to read to the end of the
archive file, which may sometimes avoid problems when multiple files
are stored on a single physical tape.

@item -k
The @samp{-k} option prevents @code{tar} from overwriting existing
files with files of the same name from the archive.

The @samp{-k} option is meaningless with @samp{-t}.

@item -K @var{filename}
The @samp{-K} option causes @code{tar} to begin extracting or listing
the archive with the file @var{filename}, and to consider only the
files starting at that point in the archive.  This is useful if a
previous attempt to extract files failed when it reached
@var{filename} due to lack of free space.  (This assumes, of course,
that there is now free space, or that you are now extracting into a
different file system.)

@item -m
When this option is used, @code{tar} leaves the modification times of
the files it extracts as the time when the files were extracted,
instead of setting them to the times recorded in the archive.

The @samp{-m} option is meaningless with @samp{-t}.

@item -O
When this option is used, instead of creating the files
specified, @code{tar} writes the contents of the files
extracted to its standard output.  This may be useful if you
are only extracting the files in order to send them through a
pipe.

The @samp{-O} option is meaningless with @samp{-t}.

@item -p
This option causes @code{tar} to set the modes (access permissions) of
extracted files exactly as recorded in the archive.  If this option is
not used, the current @code{umask} setting limits the permissions on
extracted files.

The @samp{-p} option is meaningless with @samp{-t}.

@item -s
The @samp{-s} option tells @code{tar} that the list of filenames to be
listed or extracted is sorted in the same order as the files in the
archive.  This allows a large list of names to be used, even on a
small machine that would not otherwise be able to hold all the names
in memory at the same time.  Such a sorted list can easily be created
by running @samp{tar -t} on the archive and editing its output.

@samp{-s} is probably never needed on modern computer systems.
@end table
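
As an illustration, the commands below use some of these options on a
hypothetical archive @file{/tmp/project.tar} that is assumed to contain
a file named @file{project/README}.

@example
# Send one archived file through a pipe without creating it on disk.
tar -x -O -f /tmp/project.tar project/README | wc -l
# Extract the whole archive, leaving alone any files that already
# exist and restoring the permissions recorded in the archive.
tar -x -k -p -f /tmp/project.tar
@end example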

@node Option Syntax, , Extraction Options, Options
@section Old Syntax for Options

For compatibility with Unix @code{tar}, the first argument can contain
option letters in addition to the command letter; for example, @samp{tar
cv} specifies the option @samp{-v} in addition to the command @samp{-c}.
The first argument to GNU @code{tar} is always treated as command and
option letters even if it doesn't start with @samp{-}.

Some options need their own arguments; for example, @samp{-f} is followed
by the name of the archive file.  When the option is given separately, its
argument follows it, as is usual for Unix programs.  For example:

@example
tar -c -v -b 20 -f /dev/rmt0
@end example

When options that need arguments are given together with the command, all
the associated arguments follow, in the same order as the options.  Thus,
the example above could also be written in the old style as follows:

@example
tar cvbf 20 /dev/rmt0
@end example

@noindent
Here @samp{20} is the argument of @samp{-b} and @file{/dev/rmt0} is the
argument of @samp{-f}.

@node FullDumps, IncDumps, Options, Top
@chapter Using @code{tar} to Perform Full Dumps
Full dumps should only be made when no other people or programs are
modifying files in the filesystem.  If files are modified while
@code{tar} is making the backup, they may not be stored properly in
the archive, in which case you won't be able to restore them if you
have to.

You will want to use the @samp{-V} option to give the archive a
volume label, so you can tell what this archive is even if the label
falls off the tape, or anything like that.

Unless the filesystem you are dumping is guaranteed to fit on one
volume, you will need to use the @samp{-M} option.  Make sure you
have enough tapes on hand to complete the backup.

If you want to dump each filesystem separately you will need to use
the @samp{-l} option to prevent @code{tar} from crossing filesystem
boundaries when storing (sub)directories.

The @samp{-G} option is not needed, since this is a complete copy of
everything in the filesystem, and a full restore from this backup
would only be done onto a completely empty disk.

Unless you are in a hurry, and trust the @code{tar} program (and
your tapes), it is a good idea to use the @code{-W} (verify) option,
to make sure your files really made it onto the dump properly.  This
will also detect cases where the file was modified while (or just
after) it was being archived.
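
Putting these recommendations together, a full dump of one filesystem
might be started with a command like the following.  It is a sketch
only: the device name @file{/dev/rmt0}, the filesystem @file{/usr}, and
the label text are assumptions to be replaced with your own.

@example
# -M  allow the dump to span several tapes
# -l  stay within the filesystem being dumped
# -W  verify each volume after writing it
# -V  write a volume header naming the dump
tar -c -v -M -l -W -V "Full dump of /usr" -f /dev/rmt0 /usr
@end example

@noindent
Leave out @samp{-W} if the tape drive cannot be backspaced, since
verification is not possible on such devices.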

@node IncDumps, Problems, FullDumps, Top
@chapter Using @code{tar} to Perform Incremental Dumps
Performing incremental dumps is similar to performing full dumps,
although a few more options will usually be needed.

You will need to use the @samp{-N @var{date}} option to tell @code{tar} to
only store files that have been modified since @var{date}.
@var{date} should be the date and time of the last full/incremental
dump.

A standard scheme is to do a @samp{monthly} (full) dump once a month,
a @samp{weekly} dump once a week of everything since the last monthly and
a @samp{daily} every day of everything since the last (weekly or monthly)
dump.

Here is a copy of the script used to dump the filesystems of the
machines here at the Free Software Foundation.  This script is run
(semi-)automatically late at night when people are least likely to
be using the machines.  This script dumps several filesystems from
several machines at once (by using a network-filesystem).  The
operator is responsible for ensuring that all the machines will be
up at the time the dump happens.  If a machine is not running, its
files will not be dumped, and the next day's incremental dump will
@emph{not} store files that would have gone onto that dump.

@example
#!/bin/csh
# Dump thingie
set now = `date`
set then = `cat date.nfs.dump`
/u/hack/bin/tar -c -G -v\
 -f /dev/rtu20\
 -b 126\
 -N "$then"\
 -V "Dump from $then to $now"\
 /alpha-bits/gp\
 /gnu/hack\
 /hobbes/u\
 /spiff/u\
 /sugar-bombs/u
echo $now > date.nfs.dump
mt -f /dev/rtu20 rew
@end example

Output from this script is stored in a file, for the operator to
read later.

This script uses the file @file{date.nfs.dump} to store the date/time of
the last dump.

Since this is a streaming tape drive, no attempt is made to verify
the archive.  This is also why the high blocking factor (126) is
used.  The tape drive must also be rewound by the @code{mt} command
after the dump is made.

@node Problems, Rem Tape, IncDumps, Top
@chapter Common Problems Using @code{tar}

GNU @code{tar} will not allow you to create an archive that contains
absolute pathnames.  (An absolute pathname is one that begins with a
@samp{/}.) If you try, @code{tar} will automatically remove the
leading @samp{/} from the file names it stores in the archive.  It
will also print a warning message telling you what it is doing.

When reading an archive that was created with a different @code{tar}
program, GNU @code{tar} automatically extracts entries in the
archive which have absolute pathnames as if the pathnames were not
absolute.  If the archive contained a file @samp{/usr/bin/computoy},
GNU @code{tar} would extract the file to @samp{usr/bin/computoy} in
the current directory.  If you want to extract the files in an
archive to the same absolute names that they had when the archive
was created, you should do a @samp{cd /} before extracting the files
from the archive, or you should use the command
@samp{tar -C / @dots{}}.
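
The name rewriting described above can be sketched in C as follows.
This is only an illustration, not code from the GNU @code{tar}
sources; the helper name @code{strip_leading_slashes} is hypothetical,
and removing @emph{every} leading slash (rather than just one) is an
assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return a pointer past any leading '/' characters,
   so that the resulting name is relative to the current directory.
   Skipping more than one slash is an assumption of this sketch. */
static const char *strip_leading_slashes(const char *path)
{
    assert(path != NULL);
    while (*path == '/')
        path++;
    return path;
}
```

Given @samp{/usr/bin/computoy}, the helper returns a pointer to
@samp{usr/bin/computoy} within the same string; no copy is made.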

In order to update an archive, @code{tar} must be able to backspace
the archive in order to re-read or re-write a block that was just read
(or written).  This is currently possible only on two kinds of
files:  normal disk files (or any other file that can be
backspaced with @code{lseek()}), and industry-standard 9-track magnetic
tape (or any other kind of tape that can be backspaced with
@code{ioctl(@dots{},MTIOCTOP,@dots{})}).

This means that the @samp{-r}, @samp{-u}, @samp{-A}, and @samp{-D}
commands will not work on any other kind of file.  Some media simply
cannot be backspaced, which means these commands and options will
never be able to work on them.  These non-backspacing media include
pipes and cartridge tape drives.
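
As a minimal sketch of the first case only, a program can ask whether
a file descriptor accepts @code{lseek()} at all.  The function name
@code{can_lseek_backspace} is hypothetical, and the sketch does not
cover tapes that are repositioned through
@code{ioctl(@dots{},MTIOCTOP,@dots{})}.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical check: returns nonzero if lseek() succeeds on fd.
   A pipe fails here, which matches the "cannot be backspaced" case.
   Success is only a hint; it does not prove the device can truly
   re-read or re-write a block. */
static int can_lseek_backspace(int fd)
{
    assert(fd >= 0);
    return lseek(fd, (off_t)0, SEEK_CUR) != (off_t)-1;
}
```

A caller might use such a check to refuse an update command early
instead of failing partway through the archive.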

Some other media can be backspaced, and @code{tar} will work on them
once @code{tar} is modified to do so.

Archives created with the @samp{-M}, @samp{-V}, and @samp{-G}
options may not be readable by other version of @code{tar}.  In particular,
restoring a file that was split over a volume boundary will require
some careful work with @code{dd}, if it can be done at all.  Other versions
of @code{tar} may also create an empty file whose name is that of
the volume header.  Some versions of @code{tar} may create normal
files instead of directories archived with the @samp{-G} option.

@node Rem Tape, Format, Problems, Top
@chapter The Remote Tape Server
In order to access the tape drive on a remote machine, @code{tar}
uses the remote tape server written at the University of California
at Berkeley.  The remote tape server must be installed as
@file{/etc/rmt} on any machine whose tape drive you want to use.
@code{tar} calls @file{/etc/rmt} by running an @code{rsh} or
@code{remsh} to the remote machine, optionally using a different
login name if one is supplied.

A copy of the source for the remote tape server is provided.  It is
Copyright @copyright{} 1983 by the Regents of the University of California, but
can be freely distributed.  Instructions for compiling and
installing it are included in the @file{Makefile}.

The remote tape server may need to be modified in order to run on a
non-4.3BSD system.

@node Format, , Rem Tape, Top
@chapter The Format of a @code{tar} Archive
This chapter is based heavily on John Gilmore's @i{tar}(5) manual page
for the public domain @code{tar} that GNU @code{tar} is based on.

@section The Standard Format
A @dfn{tar tape} or file contains a series of records.  Each record
contains @code{RECORDSIZE} bytes.  Although this format may be
thought of as being on magnetic tape, other media are often used.

Each file archived is represented by a header record which describes
the file, followed by zero or more records which give the contents
of the file.  At the end of the archive file there may be a record
filled with binary zeros as an end-of-file marker.  A reasonable
system should write a record of zeros at the end, but must not
assume that such a record exists when reading an archive.

The records may be @dfn{blocked} for physical I/O operations.  Each
block of @var{N} records (where @var{N} is set by the @samp{-b}
option to @code{tar}) is written with a single @code{write()}
operation.  On magnetic tapes, the result of such a write is a
single tape record.  When writing an archive, the last block of
records should be written at the full size, with records after the
zero record containing all zeroes.  When reading an archive, a
reasonable system should properly handle an archive whose last block
is shorter than the rest, or which contains garbage records after a
zero record.
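
The arithmetic behind records and blocks can be sketched as follows.
The function names are hypothetical; @code{RECORDSIZE} is repeated
here only so that the fragment stands on its own.

```c
#include <assert.h>
#include <stddef.h>

#define RECORDSIZE 512

/* Bytes transferred by one write() for a blocking factor of n records. */
static size_t block_bytes(size_t n)
{
    assert(n > 0);
    return n * RECORDSIZE;
}

/* Number of whole records needed to hold `size` bytes of file contents;
   the last record is only partly used unless size is a multiple of 512. */
static size_t records_for(size_t size)
{
    return (size + RECORDSIZE - 1) / RECORDSIZE;
}
```

For example, a blocking factor of 126 gives blocks of 126 times 512,
or 64512, bytes, and a 513-byte file occupies two records after its
header record.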

The header record is defined in C as follows:

@example
/*
 * Standard Archive Format - Standard TAR - USTAR
 */
#define  RECORDSIZE  512
#define  NAMSIZ      100
#define  TUNMLEN      32
#define  TGNMLEN      32

union record @{
    char        charptr[RECORDSIZE];
    struct header @{
        char    name[NAMSIZ];
        char    mode[8];
        char    uid[8];
        char    gid[8];
        char    size[12];
        char    mtime[12];
        char    chksum[8];
        char    linkflag;
        char    linkname[NAMSIZ];
        char    magic[8];
        char    uname[TUNMLEN];
        char    gname[TGNMLEN];
        char    devmajor[8];
        char    devminor[8];
    @} header;
@};

/* The checksum field is filled with this while the checksum is computed. */
#define    CHKBLANKS    "        "        /* 8 blanks, no null */

/* The magic field is filled with this if uname and gname are valid. */
#define    TMAGIC    "ustar  "        /* 7 chars and a null */

/* The magic field is filled with this if this is a GNU format dump entry */
#define    GNUMAGIC  "GNUtar "        /* 7 chars and a null */

/* The linkflag defines the type of file */
#define  LF_OLDNORMAL '\0'       /* Normal disk file, Unix compatible */
#define  LF_NORMAL    '0'        /* Normal disk file */
#define  LF_LINK      '1'        /* Link to previously dumped file */
#define  LF_SYMLINK   '2'        /* Symbolic link */
#define  LF_CHR       '3'        /* Character special file */
#define  LF_BLK       '4'        /* Block special file */
#define  LF_DIR       '5'        /* Directory */
#define  LF_FIFO      '6'        /* FIFO special file */
#define  LF_CONTIG    '7'        /* Contiguous file */

/* Further link types may be defined later. */

/* Bits used in the mode field - values in octal */
#define  TSUID    04000        /* Set UID on execution */
#define  TSGID    02000        /* Set GID on execution */
#define  TSVTX    01000        /* Save text (sticky bit) */

/* File permissions */
#define  TUREAD   00400        /* read by owner */
#define  TUWRITE  00200        /* write by owner */
#define  TUEXEC   00100        /* execute/search by owner */
#define  TGREAD   00040        /* read by group */
#define  TGWRITE  00020        /* write by group */
#define  TGEXEC   00010        /* execute/search by group */
#define  TOREAD   00004        /* read by other */
#define  TOWRITE  00002        /* write by other */
#define  TOEXEC   00001        /* execute/search by other */
@end example

All characters in header records are represented by using 8-bit
characters in the local variant of ASCII.  Each field within the
structure is contiguous; that is, there is no padding used within
the structure.  Each character on the archive medium is stored
contiguously.

Bytes representing the contents of files (after the header record of
each file) are not translated in any way and are not constrained to
represent characters in any character set.  The @code{tar} format
does not distinguish text files from binary files, and no
translation of file contents is performed.

The @code{name}, @code{linkname}, @code{magic}, @code{uname}, and
@code{gname} fields are null-terminated character strings.  All other
fields are zero-filled octal numbers in ASCII.  Each numeric field
of width @var{w} contains @var{w} minus 2 digits, a space, and a null,
except @code{size} and @code{mtime}, which do not contain the
trailing null.
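
A reader for these numeric fields might look like the sketch below.
The name @code{parse_octal_field} is hypothetical.  Skipping leading
blanks is an assumption made for tolerance of archives that pad with
spaces instead of zeros; it is not something the format requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parser for an octal header field of `width` bytes.
   Leading blanks are skipped (an assumption), then octal digits are
   accumulated until a space, a null, any other byte, or the end of
   the field is reached. */
static unsigned long parse_octal_field(const char *field, size_t width)
{
    unsigned long value = 0;
    size_t i = 0;

    assert(field != NULL);
    while (i < width && field[i] == ' ')
        i++;
    while (i < width && field[i] >= '0' && field[i] <= '7') {
        value = value * 8 + (unsigned long)(field[i] - '0');
        i++;
    }
    return value;
}
```

Because parsing stops at the first byte that is not an octal digit,
the same routine handles both the eight-byte fields, which end in a
space and a null, and @code{size} and @code{mtime}, which end in a
space only.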

The @code{name} field is the pathname of the file, with directory
names (if any) preceding the file name, separated by slashes.

The @code{mode} field provides nine bits specifying file permissions
and three bits to specify the Set UID, Set GID, and Save Text
(``sticky'') modes.  Values for these bits are defined above.  When
special permissions are required to create a file with a given mode,
and the user restoring files from the archive does not hold such
permissions, the mode bit(s) specifying those special permissions
are ignored.  Modes which are not supported by the operating system
restoring files from the archive will be ignored.  Unsupported modes
should be faked up when creating or updating an archive; e.g. the
group permission could be copied from the @code{other} permission.
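
To illustrate how the mode bits combine, the sketch below renders the
nine permission bits as a string and masks off the three special bits.
Both function names are hypothetical; the constants are repeated from
the header definition so that the fragment stands on its own.

```c
#include <assert.h>
#include <string.h>

/* Values repeated from the header definition. */
#define TSUID   04000
#define TSGID   02000
#define TSVTX   01000
#define TUREAD  00400
#define TUWRITE 00200
#define TUEXEC  00100
#define TGREAD  00040
#define TGWRITE 00020
#define TGEXEC  00010
#define TOREAD  00004
#define TOWRITE 00002
#define TOEXEC  00001

/* Hypothetical helper: write the nine permission bits into out as
   "rwxrwxrwx"-style text.  out must have room for 10 bytes. */
static void mode_to_string(unsigned int mode, char out[10])
{
    static const unsigned int bits[9] = {
        TUREAD, TUWRITE, TUEXEC, TGREAD, TGWRITE, TGEXEC,
        TOREAD, TOWRITE, TOEXEC
    };
    static const char letters[] = "rwxrwxrwx";
    int i;

    assert(out != NULL);
    memcpy(out, "---------", 10);   /* nine dashes plus the null */
    for (i = 0; i < 9; i++)
        if (mode & bits[i])
            out[i] = letters[i];
}

/* Hypothetical helper: clear the Set UID, Set GID and Save Text bits,
   as an extractor might do when the user lacks the needed permissions. */
static unsigned int drop_special_bits(unsigned int mode)
{
    return mode & ~(unsigned int)(TSUID | TSGID | TSVTX);
}
```

The second helper corresponds to the case where the user restoring
files may not create set-ID files: the special bits are dropped and
the ordinary permissions are kept.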

The @code{uid} and @code{gid} fields are the numeric user and group
ID of the file owners, respectively.  If the operating system does
not support numeric user or group IDs, these fields should be
ignored.

The @code{size} field is the size of the file in bytes; linked files
are archived with this field specified as zero.
@xref{Extraction Options}; in particular the @samp{-G} option.@refill

The @code{mtime} field is the modification time of the file at the
time it was archived.  It is the ASCII representation of the octal
value of the last time the file was modified, represented as an
integer number of seconds since January 1, 1970, 00:00 Coordinated
Universal Time.

The @code{chksum} field is the ASCII representation of the octal
value of the simple sum of all bytes in the header record.  Each
8-bit byte in the header is added to an unsigned integer,
initialized to zero, the precision of which shall be no less than
seventeen bits.  When calculating the checksum, the @code{chksum}
field is treated as if it were all blanks.
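
The checksum rule can be sketched as below.  The function name and
the two @code{CHKSUM_} macros are hypothetical; the offset of 148 is
derived from the field sizes that precede @code{chksum} in the header
definition (100 + 8 + 8 + 8 + 12 + 12).

```c
#include <assert.h>
#include <stddef.h>

#define RECORDSIZE    512
#define CHKSUM_OFFSET 148   /* bytes before chksum: 100+8+8+8+12+12 */
#define CHKSUM_LEN    8

/* Hypothetical helper: sum all bytes of a header record as unsigned
   values, counting the chksum field itself as eight blanks. */
static unsigned long header_checksum(const unsigned char *record)
{
    unsigned long sum = 0;
    size_t i;

    assert(record != NULL);
    for (i = 0; i < RECORDSIZE; i++) {
        if (i >= CHKSUM_OFFSET && i < CHKSUM_OFFSET + CHKSUM_LEN)
            sum += ' ';
        else
            sum += record[i];
    }
    return sum;
}
```

The largest possible sum is 512 times 255, or 130560, which fits in
seventeen bits; this is why the format asks for at least that much
precision.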

The @code{linkflag} field specifies the type of file archived.  If a
particular implementation does not recognize or permit the specified
type, the file will be extracted as if it were a regular file.  As
this action occurs, @code{tar} issues a warning to the standard
error.

@table @code
@item LF_NORMAL
@itemx LF_OLDNORMAL
These represent a regular file.  In order to be compatible with
older versions of @code{tar}, a @code{linkflag} value of
@code{LF_OLDNORMAL} should be silently recognized as a regular
file.  New archives should be created using @code{LF_NORMAL}.  Also,
for backward compatibility, @code{tar} treats a regular file whose
name ends with a slash as a directory.

@item LF_LINK
This represents a file linked to another file, of any type,
previously archived.  Such files are identified in Unix by each file
having the same device and inode number.  The linked-to
name is specified in the @code{linkname} field with a trailing null.

@item LF_SYMLINK
This represents a symbolic link to another file.  The linked-to
name is specified in the @code{linkname} field with a trailing null.

@item LF_CHR
@itemx LF_BLK
These represent character special files and block special files
respectively.  In this case the @code{devmajor} and @code{devminor}
fields will contain the major and minor device numbers
respectively.  Operating systems may map the device specifications
to their own local specification, or may ignore the entry.

@item LF_DIR
This specifies a directory or sub-directory.  The directory name in
the @code{name} field should end with a slash.  On systems where
disk allocation is performed on a directory basis the @code{size}
field will contain the maximum number of bytes (which may be rounded
to the nearest disk block allocation unit) which the directory may
hold.  A @code{size} field of zero indicates no such limiting.
Systems which do not support limiting in this manner should ignore
the @code{size} field.

@item LF_FIFO
This specifies a FIFO special file.  Note that the archiving of a
FIFO file archives the existence of this file and not its contents.

@item LF_CONTIG
This specifies a contiguous file, which is the same as a normal
file except that, in operating systems which support it,
all its space is allocated contiguously on the disk.  Operating
systems which do not allow contiguous allocation should silently treat
this type as a normal file.

@item 'A' @dots{}
@itemx 'Z'
These are reserved for custom implementations.  Some of these are
used in the GNU modified format, as described below.
@end table

Other values are reserved for specification in future revisions of
the P1003 standard, and should not be used by any @code{tar} program.
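
The fallback rules for file types can be collected into one small
routine.  This is a sketch under stated assumptions, not the logic of
any particular @code{tar}: the name @code{effective_linkflag} is
hypothetical, and passing the letters @code{'A'} through @code{'Z'}
back unchanged simply leaves those types for the caller to interpret.

```c
#include <assert.h>
#include <string.h>

/* Values repeated from the header definition. */
#define LF_OLDNORMAL '\0'
#define LF_NORMAL    '0'
#define LF_DIR       '5'
#define LF_CONTIG    '7'

/* Hypothetical helper: map the type byte stored in a header to the
   type an extractor would act on. */
static char effective_linkflag(char flag, const char *name)
{
    size_t len;

    assert(name != NULL);
    len = strlen(name);

    if (flag == LF_OLDNORMAL)
        flag = LF_NORMAL;               /* old-style regular file */
    if (flag == LF_NORMAL && len > 0 && name[len - 1] == '/')
        return LF_DIR;                  /* backward-compatible directory */
    if (flag >= LF_NORMAL && flag <= LF_CONTIG)
        return flag;                    /* one of the standard types */
    if (flag >= 'A' && flag <= 'Z')
        return flag;                    /* custom type; caller decides */
    return LF_NORMAL;                   /* unrecognized: regular file */
}
```

A real extractor would also issue the warning mentioned above when it
falls back to treating an unrecognized type as a regular file.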

The @code{magic} field indicates that this archive was output in the
P1003 archive format.  If this field contains @code{TMAGIC}, the
@code{uname} and @code{gname} fields will contain the ASCII
representation of the owner and group of the file respectively.  If
found, the user and group ID represented by these names will be used
rather than the values within the @code{uid} and @code{gid} fields.

@section GNU Extensions to the Archive Format
The GNU format uses additional file types to describe new types of
files in an archive.  These are listed below.

@table @code
@item LF_DUMPDIR
@itemx 'D'
This represents a directory and a list of files created by the
@samp{-G} option.  The @code{size} field gives the total size of the
associated list of files.  Each filename is preceded by either a @code{'Y'}
(the file should be in this archive) or an @code{'N'} (the file is a
directory, or is not stored in the archive).  Each filename is
terminated by a null.  There is an additional null after the last
filename.

@item LF_MULTIVOL
@itemx 'M'
This represents a file continued from another volume of a
multi-volume archive created with the @samp{-M} option.  The original
type of the file is not given here.  The @code{size} field gives the
maximum size of this piece of the file (assuming the volume does not
end before the file is written out).  The @code{offset} field gives
the offset from the beginning of the file where this part of the
file begins.  Thus @code{size} plus @code{offset} should equal the
original size of the file.

@item LF_VOLHDR
@itemx 'V'
This file type is used to mark the volume header that was given with
the @samp{-V} option when the archive was created.  The @code{name}
field contains the name given after the @samp{-V} option.
The @code{size} field is zero.  Only the first file in each volume
of an archive should have this type.

@end table
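
As an illustration of the @code{LF_DUMPDIR} layout, the sketch below
walks the list of files that follows such a header.  The function name
@code{count_dumpdir_entries} is hypothetical, and the code assumes the
whole list has already been read into memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the entries in an LF_DUMPDIR file list.
   Each entry is a flag byte ('Y' or 'N'), a filename, and a null; an
   extra null follows the last entry.  The number of entries flagged
   'Y' is stored through `stored`. */
static size_t count_dumpdir_entries(const char *list, size_t size,
                                    size_t *stored)
{
    size_t pos = 0;
    size_t total = 0;

    assert(list != NULL && stored != NULL);
    *stored = 0;
    while (pos < size && list[pos] != '\0') {
        if (list[pos] == 'Y')
            (*stored)++;
        total++;
        /* Skip the flag byte and the filename, then the null. */
        while (pos < size && list[pos] != '\0')
            pos++;
        pos++;
    }
    return total;
}
```

The loop ends either at the additional null that follows the last
filename or at the end of the buffer, so a truncated list does not
cause a read past @code{size}.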

You may have trouble reading a GNU format archive on a non-GNU system
if the options @samp{-G}, @samp{-M} or @samp{-V} were used when writing
the archive.
@unnumbered Concept Index
@printindex cp
@contents
@bye