⟦fac708cd9⟧

TextFile

8DESIGN OF A PORTABLE FILE SYSTEM5

8NUCLEUS5

Peter Mikkelsen
Datalogisk Institut ved
Københavns Universitet
September 1980

A_B_S_T_R_A_C_T_: The report argues that the emergence of high-level sys-
_Æ_æ_å_ tem programming languages makes it an attractive and practicable
_?_>_å_ task to design a file system suited for use on even very diffe-
rent computers. Objectives for such a design are presented, and a
_Æ_æ_å_ file system nucleus is designed. This nucleus offers: hierarchi-
___å_ cal naming of files, simple authorization enforcement support,
_?_>_å_ extendable facilities of resource management and integrity pro-
tection. Only the most basic functions for retrieval and updating
of the actual contents of files are offered.

K_E_Y_W_O_R_D_S_: File system, file organization, naming of files, autho-
rization enforcement.

This report exists as an A/S Regnecentralen af 1979 report under
the registration RCSL No 31-D 608.
\f

T_A_B_L_E_ _O_F_ _C_O_N_T_E_N_T_S_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _P_A_G_E_

1. INTRODUCTION .......................................... 1
1.1 Background ....................................... 1
1.2 Scope of the Project ............................. 2
1.3 This Report ...................................... 4

2. MODELS FOR STORAGE AND RETRIEVAL OF BACKING STORAGE
DATA .................................................. 5
2.1 A Model for a Hierarchical File System ........... 6
2.1.1 Access Methods (AM) ....................... 8
2.1.2 Logical File System (LFS) ................. 8
2.1.3 Basic File System (BFS) ................... 8
2.1.4 File Organization Strategy Modules (FOSM) . 9
2.1.5 Device Strategy Modules (DSM) ............. 9
2.1.6 Input/Output Control System (IOCS) ........ 9
2.2 Functions of the File System: an Illustration .... 10

3. ENVIRONMENTAL CONSTRAINTS ON THE FILE SYSTEM .......... 12
3.1 Characteristics of Processor ..................... 12
3.1.1 Objectives ................................ 13
3.2 Characteristics of Backing Storage ............... 13
3.2.1 Objectives ................................ 14
3.3 Characteristics of Data .......................... 14
3.3.1 Objectives ................................ 18
3.4 Characteristics of Users ......................... 18
3.4.1 Objectives ................................ 19
3.5 Protection Against Errors ........................ 20
3.5.1 Objectives ................................ 21

4. DESIGN OF THE INDIVIDUAL LEVELS ....................... 22

5. BASIC FILE SYSTEM (BFS) ............................... 23
5.1 Objectives ....................................... 23
5.2 Design ........................................... 23
5.2.1 Organization of the Descriptors ........... 23 \f

T_A_B_L_E_ _O_F_ _C_O_N_T_E_N_T_S_ _(_c_o_n_t_i_n_u_e_d_)_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _P_A_G_E_

5.2.2 Redundancy Concerning Location of
Descriptors ............................... 27
5.2.3 Contents of Descriptors ................... 30
5.3 Functions of the BFS-Level ....................... 31
5.4 Concluding Remarks ............................... 32

6. FILE ORGANIZATION STRATEGY MODULE (FOSM) .............. 33
6.1 Objectives ....................................... 33
6.2 Design ........................................... 34
6.2.1 Page Organization of Files ................ 34
6.2.2 Organization of Index Tables .............. 37
6.2.2.1 Structure of the Individual
Indices .......................... 38
6.2.2.2 The Address Space of the
Individual Indices ............... 39
6.2.2.3 Representation of Index Tables in
Core ............................. 40
6.2.2.4 Single-Level or Multi-Level
Indices .......................... 43
6.2.3 Dynamics of Files ......................... 43
6.2.4 Unreadability of Deallocated Pages ........ 45
6.3 Functions of the FOSM-Level ...................... 46
6.4 Concluding Remarks ............................... 47

7. DEVICE STRATEGY MODULES (DSM) ......................... 48
7.1 Objectives ....................................... 48
7.2 Design ........................................... 49
7.2.1 Sequencing Strategies ..................... 49
7.2.2 Allocation/Deallocation of Pages,
Administration ............................ 49
7.2.3 Allocation/Deallocation of Pages,
Strategies ................................ 52
7.2.4 Marking of Malfunctioning Pages ........... 54
7.3 Functions of the DSM-Level ....................... 56
7.4 Concluding Remarks ............................... 56 \f

iii

T_A_B_L_E_ _O_F_ _C_O_N_T_E_N_T_S_ _(_c_o_n_t_i_n_u_e_d_)_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _P_A_G_E_

8. INPUT OUTPUT CONTROL SYSTEM (IOCS) .................... 57
8.1 Objectives ....................................... 57
8.2 Design ........................................... 57
8.3 Functions of the IOCS-Level ...................... 59
8.4 Concluding Remarks ............................... 59

9. ACCESS METHODS (AM) ................................... 60

10. LOGICAL FILE SYSTEM (LFS) ............................. 61
10.1 Objectives ....................................... 61
10.2 Design ........................................... 62
10.2.1 A Model for Naming ........................ 62
10.2.1.1 Special Objectives of Naming ..... 63
10.2.1.2 Names: An Abstract Model ......... 63
10.2.2 How Should Files be Named? ................ 66
10.2.2.1 Restrictions on the Digraph
Structure ........................ 66
10.2.2.2 The Function NAME ................ 69
10.2.2.3 The Function DEFINECONTEXT ....... 70
10.2.2.4 Summary of Naming Restrictions ... 70
10.2.3 Structure of the Catalogs ................. 70
10.2.4 Authorization Enforcement ................. 72
10.2.5 Deleting of Files ......................... 75
10.3 Basic Functions of the LFS ....................... 79
10.4 Concluding Remarks ............................... 81

11. REMARKS ON IMPLEMENTATION ............................. 82
11.1 Structuring the File System ...................... 82
11.2 Transparancy of Hierarchical Levels .............. 83

12. A PRELIMINARY REFERENCE MANUAL ........................ 86
12.1 Introduction ..................................... 86
12.2 Components of the Backing Storage ................ 86
12.2.1 Volumes ................................... 86
12.2.2 Pages ..................................... 87 \f

T_A_B_L_E_ _O_F_ _C_O_N_T_E_N_T_S_ _(_c_o_n_t_i_n_u_e_d_)_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _P_A_G_E_

12.2.3 Files ..................................... 87
12.2.4 Catalogs .................................. 89
12.2.5 Volumedescriptions ........................ 90
12.3 Functions of the File System ..................... 91
12.3.1 Volume Handling ........................... 91
12.3.2 Catalog Handling .......................... 91
12.3.3 File Handling ............................. 92
12.3.4 Authorization Enforcement ................. 94

13. CONCLUSION ............................................ 97

A_P_P_E_N_D_I_X_:

A. REFERENCES ............................................ 99

F_ 1_._ _ _ _ _ _ _ _ _I_N_T_R_O_D_U_C_T_I_O_N_ 1.

The file system is an essential part of the operating system of
any modern computer system.

The importance of a well-suited file system has been growing --
and will no doubt continue to do so in the future -- concurrently
with the need for, and development of still more sophisticated
and potent backing storage devices.

Distributed systems with configurations of several -- possibly
very different-sized -- computers, have become increasingly
prevalent. This has accentuated the need for compatible file
systems, allowing easy communication of data between installa-
tions -- for instance via exchange of backing storage media.

The emergence of high-level system programming languages like
Pascal (Jensen and Wirth 1975), Concurrent Pascal (Brinch Hansen
1977), Platon (Staunstrup et al. 1975), Beta (Kristensen et al.
1979), Modula (Wirth 1977), Ada (Wegner 1979), etc. has made it
an attractive and practicable task to design a file system suited
for use on even very different computer systems (yet with at
least one thing in common: access to a compiler for the high
level language in question).

Such a portable file system has the additional advantage of
reducing implementation expenses: it is less expensive to imple-
ment o_n_e_ file system which can be used on a number of computer
configurations, than implementing s_e_v_e_r_a_l_ individual file systems
for the different configurations.

To design such a file system (or at least the nucleus of such) is
the goal of this project.

1_._1_ _ _ _ _ _ _ _B_a_c_k_g_r_o_u_n_d_ 1.1

The work of the present report was initiated in april 1980 in
connection with a project at A/S Regnecentralen af 1979.
\f

The aim of that project was to design a file system for the RC700
microcomputer (based on the Zilog Z80 8-bit micro-processor). The
file system were to be implemented in the high-level language
Pascal80 (Staunstrup 1980), which is developed on basis of the
language Platon (Staunstrup et al. 1975).

It was soon decided that the use of Pascal80 for implementation
could be utilized, allowing the file system to be used on other
computer systems.

It was therefore necessary to reconsider the design of the file
system, as the premises were changed. The result of these re-con-
siderations is presented in this report.

1_._2_ _ _ _ _ _ _ _S_c_o_p_e_ _o_f_ _t_h_e_ _P_r_o_j_e_c_t_ 1.2

Design of a file system for a range of computers might seem an
impossible task to undertake. How could a single file system
possibly encompass environments, ranging from single-user
micro-computers with less than 64 Kb of core storage -- to a
large-scale multi-user, multi-processor computer equipped with
several kinds of backing storage devices?

Well, it might be infeasible to create a g_e_n_e_r_a_l_ _p_u_r_p_o_s_e_ file
system offering an assorted range of facilities, which would
encompass the full range of configuration possibilities.

That is the reason why the philosophy of this project is to con-
struct a n_u_c_l_e_u_s_ for such a file system.

That nucleus must offer a restricted -- but hopefully versatile
-- f_u_n_d_a_m_e_n_t_ for superstructuring levels of higher abstraction
and further facilities. The facilities may be elaborated diffe-
rently to any extent, on the various computers.

On the other hand, the nucleus must be able to perform as a file
system in itself -- without built-on extra facilities.
\f

The notable point is that the proposed hierarchical philosophy
ensures exchange of data between considerably differenced com-
puter systems, since the lowest levels -- the nucleus -- are the
same on any of the systems.

With the above described design philosophy in mind, the defini-
tions and scope of the file system presented in this report,
could be stated as follows:

- the file system should give access to the backing storage of
a computer on an abstract level of f_i_l_e_s_.

- a file is a named sequence of b_y_t_e_s_ presumably containing
d_a_t_a_. The file is split into a number of fixed size p_a_g_e_s_. A
file is described by, and identified through, various admin-
istrative datastructures, a.o. the file d_e_s_c_r_i_p_t_o_r_ and one
or more file n_a_m_e_s_ (the terms will be defined more precisely
at a later point).

- the file system offers only the most r_u_d_i_m_e_n_t_a_r_y_ _o_p_e_r_a_t_i_o_n_s_
to access and maintain the files. That is: it is not consi-
dered part of the file system nucleus to offer acces to var-
iable size records, as well as multibuffered i/o and simi-
lars are considered extensions which higher levels of an ac-
tual system should offer.

- s_p_e_c_i_a_l_i_z_e_d_ _a_c_c_e_s_s_ _m_e_t_h_o_d_s_ (e.g. sequential access, indexed
sequential access, hashed access, list-structured access
etc.) -- are not considered part of the file system nucleus.

- only r_a_n_d_o_m_ _a_c_c_e_s_s_ backing storage is considered. That is to
say that devices like magnetic tape, paper tape, punched
cards etc. is not considered in this report -- it is up to
higher levels to implement uniform access ("generalized file
system") even to such devices.
\f

- multi-user r_e_s_o_u_r_c_e_ _a_l_l_o_c_a_t_i_o_n_ _p_r_o_b_l_e_m_s_ and consequent
scheduling problems are not considered part of the file
system nucleus.

1_._3_ _ _ _ _ _ _ _T_h_i_s_ _R_e_p_o_r_t_ 1.3

The present report is written for persons who want a detailed
knowledge of the considerations which led to the proposed file
system design.

The reader is supposed to be familiar with ordinary system pro-
gramming techniques, and have a general knowledge of file systems
-- at least from the application programmer>s point of view.

Chapter 2 introduces a hierarchical model, modularizing the
structure of file systems. The model offers the conceptual basis
for the rest of the text.

Chapter 3 investigates the environmental constraints on the want-
ed file system; that is: it defines the demanded specifications
of the system. The specifications are formulated as a set of
objectives for the file system design.

The individual modules/levels -- as demarcated in chapter 2 --
are discussed and designed in the chapters 4 to 10. Chapter 10 --
the Logical File System -- will probably draw most attention,
since it is the decisions of this level that are of the most
immediate interest to users of the file system. The functions of
the Logical File System include the strategy for symbolic naming
of files, and authorization enforcement procedures.

Some remarks and recommendations for the actual implementation of
the proposed file system are given in chapter 11.

Finally -- before the concluding chapter 13 -- chapter 12 gives a
"Preliminary Reference Manual" for the file system. The chapter
presents an overview of the proposed file system.

F_ 2_._ _ _ _ _ _ _ _ _M_O_D_E_L_S_ _F_O_R_ _S_T_O_R_A_G_E_ _A_N_D_ _R_E_T_R_I_E_V_A_L_ _O_F_ _B_A_C_K_I_N_G_ _S_T_O_R_A_G_E_ _D_A_T_A_ 2.

A computer system consists of a number of hardware components
with associated software. The components may conveniently be
divided into two groups:

- central processing unit and core storage
- peripheral devices.

The first group -- together with its associated software -- is
supposed to manipulate data already present in the core storage.

To introduce (input) new data in the core storage, or communicate
data from the core storage to the external world (output), the
second group -- peripheral devices -- is used.

The software handling the transfer of data between the central
and peripheral devices is called the i_n_p_u_t_/_o_u_t_p_u_t_ _s_y_s_t_e_m_ (I/O
system).

As stated in section 1.2 we will focus on the special class of
peripheral devices known as "random access backing storage" de-
vices, in this report. The subsystem of an I/O system, handling
such devices is usually called the f_i_l_e_ _s_y_s_t_e_m_.

The backing storage may be of considerable dimensions, not pos-
sible to be encompassed in detail by the human mind. Therefore,
to facilitate transfer of data to and from backing storage, it is
necessary for the file system to maintain the administration of
an abstract model of the backing storage. This model must order,
group, relate, etc. the collection of data residing on backing
storage, so that it enables storage and retrieval of any data-
structure a given user may want to manipulate.

Different models to this end may be considered.

The classical approach is the notion of a file, consisting of a
number of consecutive records (possibly of variable length).
\f

But other models have been proposed and implemented (IBM 1978).
These models often have a close relationship to certain database
designs, or the like.

In this chapter we will look into only the classical approach. In
section 2.1 a hierarchical file system model is described. This
-- due to Madnick and Alsop (Madnick et al. 1969) -- has come to
be considered the standard of the field.

It would be an interesting point of further investigation to look
into alternative and more exotic models for the storage and re-
trieval of data on backing storage. However, it is not considered
within the scope of the present project to elaborate further on
this issue.

2_._1_ _ _ _ _ _ _ _A_ _M_o_d_e_l_ _f_o_r_ _a_ _H_i_e_r_a_r_c_h_i_c_a_l_ _F_i_l_e_ _S_y_s_t_e_m_ 2.1

The model presented in this section is primarily extracted from
the papers of Madnick (Madnick 1969a, Madnick et al. 1969b).

The reason for presenting the model is that it will furnish an
excellent framework for discussion and analysis of the different
parts of a file system. Furthermore, it will hopefully be a
guide-line for the implementor of the file system, since it sug-
gests a neatly structured and flexible implementation.

The above mentioned papers describe a file system design model
consisting of a hierarchy of six levels. Actual implementations
of file systems may not necessarily conform exactly with these
levels: levels may be combined, further subdivided, or even
absent (or at least reduced to a trivial form).
\f

The six hierarchical levels to be described below are:

- Access Methods (AM)
- Logical File System (LFS)
- Basic File System (BFS)
- File Organization Strategy Modules (FOSM)
- Device Strategy Modules (DSM)
- Input/Output Control System (IOCS)

(figur)

Figure 2.1: Hierarchical file system.

2_._1_._1_ _ _ _ _ _A_c_c_e_s_s_ _M_e_t_h_o_d_s_ _(_A_M_)_ 2.1.1

This level offers an interface to the file system. The interface
may include different file formatting functions: sequential
files, random access files, fixed-length record files, variable-
length record files, etc. It may include multiple-buffer facili-
ties. It may hide differing physical characteristics of different
backing storage devices.

Obviously, a user may supply his own access method functions aug-
menting the functions of this level.

2_._1_._2_ _ _ _ _ _L_o_g_i_c_a_l_ _F_i_l_e_ _S_y_s_t_e_m_ _(_L_F_S_)_ 2.1.2

This level transforms a user-supplied symbolic file name to an
internal unique "f_i_l_e_ _i_d_e_n_t_i_f_i_e_r_". The tables of transformation
data are usually called the c_a_t_a_l_o_g_(s). The symbolic filename
will never enter any of the levels below this level.

2_._1_._3_ _ _ _ _ _B_a_s_i_c_ _F_i_l_e_ _S_y_s_t_e_m_ _(_B_F_S_)_ 2.1.3

The Basic File System finds and maintains -- for a given file
identifier -- the corresponding "f_i_l_e_ _d_e_s_c_r_i_p_t_o_r_". The file de-
scriptor contains the "system-associated" data, such as "length",
"location", "reservation locks", etc.

The essential functions of "opening" and "closing" a file are
primarily performed in the BFS: when a file is opened, the BFS
selects a free FOSM (File Organization Strategy Module), which it
intializes with data from the file descriptor. Read and write
operations can hereafter be directed immediately to the FOSM.
"Close" releases the FOSM of the file.

The maintenance of file descriptors is the responsibility of the
BFS-level, too.

2_._1_._4_ _ _ _ _ _F_i_l_e_ _O_r_g_a_n_i_z_a_t_i_o_n_ _S_t_r_a_t_e_g_y_ _M_o_d_u_l_e_s_ _(_F_O_S_M_)_ 2.1.4

The user views a file as a collection of contiguous records. To
optimize storage device performances, this may not be true in
reality: the physical records of data may be scattered arbitrari-
ly on the backing storage.

When a FOSM receives an I/O-request concerning a given record
with an associated logical address (i.e. the logical position
within the file), it translates the logical address to an address
of a physical record, and forwards the request to the appropriate
Device Strategy Module.

2_._1_._5_ _ _ _ _ _D_e_v_i_c_e_ _S_t_r_a_t_e_g_y_ _M_o_d_u_l_e_s_ _(_D_S_M_)_ 2.1.5

Characteristics of access to records of different backing storage
devices vary much.

When several requests for access to records of a given device are
queued up, it is the responsibility of the Device Strategy Module
connected to the device, to produce an optimal sequence of re-
quests. The strategy of sequencing obviously depends on charac-
teristics such as latency and access time.

Furthermore, the DSM administers the allocation and deallocation
of records on the device. When a FOSM requests allocation of a
number of records, the DSM will select these.

2_._1_._6_ _ _ _ _ _I_n_p_u_t_/_O_u_t_p_u_t_ _C_o_n_t_r_o_l_ _S_y_s_t_e_m_ _(_I_O_C_S_)_ 2.1.6

The IOCS consists of the collection of basic drivers to the back-
ing storage devices. It is assumed that these drivers perform
functions such as: queuing of i/o-requests, automatic error reco-
very attempts in case of spurious errors, reasonable standard
actions in case of irrepairable errors etc.

2_._2_ _ _ _ _ _ _ _F_u_n_c_t_i_o_n_s_ _o_f_ _t_h_e_ _F_i_l_e_ _S_y_s_t_e_m_:_ _a_n_ _I_l_l_u_s_t_r_a_t_i_o_n_ 2.2

The hierarchy of the file system design model described above,
may be i_l_l_u_s_t_r_a_t_e_d_ by means of functions: each level corresponds
to a function defining the operations of the level.

If we disregard the AM-level -- which is irrelevant or trivial in
this connection -- we may define following abstract functions:

LFS: Name File-id
BFS: File-id Descriptor
FOSM: Descriptor Logical-pageno Logical-address
Device-id
DSM: Logical-address Device-id Physical-address
IOCS: Physical-address Data

The meaning of the defining sets are hopefully self-evident.

The basic functions of a file system are naturally "read" and
"write".

As an example we will use the function READ, defined as follows:

READ: Name Logical-pageno Data

buffer = READ (file, page)
buffer Data,
file Name,
page Logical-pageno
\f

In fact -- as one naturally must expect -- READ may be defined by
means of the functions of the file system:

READ = LFS BFS FOSM DSM IOCS
or

(1) READ (file, page)
= IOCS (DSM (FOSM (BFS (LFS (file)), page)))

This formula illustrates the usefulness of a function OPEN:

If a sequence of READ>s is performed on the one and same file,
the variable "file" will be constant. From formula (1) we get:

(2) READ (file, page) = IOCS (DSM (FOSM (k, page)))

where

k = BFS (LFS (file)),

The function (2) obviously involves less computation than (1), as
the two functions BFS and LFS are eliminated.

Consequently, we may define a function OPEN:

OPEN: Name Descriptor
OPEN = LFS BFS

which for a sequence of READ-operations is constant, and thus
only needs computation once.

F_ 3_._ _ _ _ _ _ _ _ _E_N_V_I_R_O_N_M_E_N_T_A_L_ _C_O_N_S_T_R_A_I_N_T_S_ _O_N_ _T_H_E_ _F_I_L_E_ _S_Y_S_T_E_M_ 3.

In this chapter the environmental constraints on the file system
will be investigated. The environment will be treated in four:

characteristics of
- processor (section 3.1)
- backing storage (section 3.2)
- data -- contents of files (section 3.3)
- users (section 3.4)

As the survival of stored data is of vital importance in a com-
puter system, it is necessary to consider the protection of these
data. This will be done in section 3.5.

In each of these five subsections, the considerations of
constraints will be expressed in a set of objectives for the
design of the file system.

These objectives -- and deduced further points -- are the basis
for the detailed discussion of file system design in chapters 5
to 10.

3_._1_ _ _ _ _ _ _ _C_h_a_r_a_c_t_e_r_i_s_t_i_c_s_ _o_f_ _P_r_o_c_e_s_s_o_r_ 3.1

The file system is to be used on microcomputers as well as large-
scale mainframe computers. This leads to considerations on core-
size, speed of processing and so on. It is a necessity that the
file system is able to perform well, both on a small (core-) size
and slow microcomputer as well as on a larger and faster com-
puter. To perform well -- relative to the overall performance of
the computer -- this probably means that the core resident part
of the file system should be able to vary in size (controlled by
initialization and configuration parameters).

3_._1_._1_ _ _ _ _ _O_b_j_e_c_t_i_v_e_s_ 3.1.1

- The file system (implementation) should be structured so that
its core storage demands are controlled by initialization/con-
figuration parameters.

3_._2_ _ _ _ _ _ _ _C_h_a_r_a_c_t_e_r_i_s_t_i_c_s_ _o_f_ _B_a_c_k_i_n_g_ _S_t_o_r_a_g_e_ 3.2

The file system should -- ideally -- be for use on both slow
(floppy-disks), medium fast (hard discs, drums, magnetic bubble
memories), and very fast (semiconductor or core memory) media.

In practice, this is hard to accomplish. Therefore some trade-
offs must be made. In keeping with the current trends we will
want to focus our attention on the disktype devices, primarily
hard disk types.

Magnetic bubble memory is not yet a serious alternative to hard
disks. But prices will decrease dramatically within the next few
years. Fortunately -- from the viewpoint of this design -- the
access time characteristics of magnetic bubble memory resemble
those of ordinary hard disks -- at least in principle: access
time is divided in seek and latency time. Hence, hard disk access
strategies are not necessarily irrelevant in connection with mag-
netic bubble memory. It might be said that the fact that magnetic
bubble memories are an order of magnitude faster than hard disks
is just an added convenience.

Figure 3.1: Backing storage characteristics, from (Hnatek 1979).
\f

Thus the prime problem is to balance the factors of slow floppy-
disk-like devices against those of medium-fast devices.

3_._2_._1_ _ _ _ _ _O_b_j_e_c_t_i_v_e_s_ 3.2.1

- the number of (random) accesses to the backing storage must
be minimal for often used facilities of the file system.

- organization of logically connected parts of backing storage
(e.g. sequential files) should be optimized relative to the
physical characteristics of different media. The file system
must consequently be able to encompass resulting d_i_f_f_e_r_e_n_t_
strategies of allocation and device handling.

3_._3_ _ _ _ _ _ _ _C_h_a_r_a_c_t_e_r_i_s_t_i_c_s_ _o_f_ _D_a_t_a_ 3.3

The file system must be able to handle few as well as a great
number of files. Also it must be able to handle small as well as
huge files.

Actual computer installations have been investigated with respect
to the size of files.

The tables of fig. 3.2, 3.3, and 3.4 show the distribution of
file sizes in some typical computer-environments. It is clearly
shown that the majority of files are relative small (less than 10
Kb).

Data may be accessed in a wide varity of ways. Therefore diffe-
rent overlays of internal structures on the files (random access
organizations, list- and chain organizations etc.) should be well
supported by the file system.
\f

Figure 3.2: File sizes.
The installation is a rather big minicomputer used
for system development at A/S Regnecentralen af 1979.
\f

Figure 3.3: File sizes.
The installation is a medium-sized minicomputer used
for system development at A/S Regnecentralen af 1979.
\f

Figure 3.4: File sizes.
The installation is a very big minicomputer system
used for both batch and on-line applications. The
system is situated at a customer of A/S Regnecentra-
len af 1979.
\f

Many computer installations comprise a limited number of devices
for backing storage. Typically there may be a few disk drives,
but a lot of volumes (kits) which may be mounted on these drives.
This situation is obviously caused by the relative high expence
of backing storage devices (compared to the expence of the rest
of the computer system).

Consequently, the files may be situated on a lot of volumes, and
all the volumes of the system is not necessarily mounted at the
same time.

3_._3_._1_ _ _ _ _ _O_b_j_e_c_t_i_v_e_s_ 3.3.1

- The file system must be able to handle few as well as a
great number of files.

- must be able to handle small as well as huge files.
Especially small files must be handled efficiently.

- the files may be located on a lot of volumes, not necessa-
rily mounted concurrently (at the same time).

- the files must be fully dynamic. Single files must be able
to grow or shrink (at least at the tail of the file, but
preferably anywhere: beginning, midst, or tail of the file).

3_._4_ _ _ _ _ _ _ _C_h_a_r_a_c_t_e_r_i_s_t_i_c_s_ _o_f_ _U_s_e_r_s_ 3.4

The file system may be situated in the context of single-user as
well as multi-user systems. The file system must therefore be
able to handle situations where a number of users access files on
shared volumes of backing storage. This demands certain facili-
ties of synchronization (exclusive and shared access etc.) and
resource management.
\f

The multi-user situation also calls for more sophisticated naming
facilities: users should be able to use filenames independent on
other users choise of names, and files should be protected
against accidental or malicious misuse by other users.

Whatever protection rules are enforced, the environment is prone
to update already existing data with false new data from time to
time. Therefore it may be a facility of the file system automati-
cally to dump (updated) data (backup). Admittedly, such facili-
ties are on the borderline of what a file system -- in itself --
should be able to cope with.

3_._4_._1_ _ _ _ _ _O_b_j_e_c_t_i_v_e_s_ 3.4.1

- Users must be able to choose filenames independently, with-
out concern for duplicate filenames (otherwise they will
have to guess at names not previously chosen).

- it must be possible to specify accessrights allowing private
as well as public use, so that a number of users may share
the same file.

- synchronization of concurrent use of single files must be
implemented.

- the file system must deliver such relevant data, that
implementation of an augmenting resource management system
is possible.

S_e_c_o_n_d_a_r_y_ _o_b_j_e_c_t_i_v_e_s_:

- automatic backup procedures should be supported by the file
system.

3_._5_ _ _ _ _ _ _ _P_r_o_t_e_c_t_i_o_n_ 3.5

In any computer system, data are in danger of being improperly
e_x_p_o_s_e_d_ or d_a_m_a_g_e_d_ (Lampson 1974). Protections are a defense
against such threats.

Threats may be caused by a_c_c_i_d_e_n_t_ or m_a_l_i_c_e_.

Malice tries to make use of any weakness in a protection system.
Malice may have much more far-reaching consequences than acci-
dents (unobtrusive modification is much worse than outright de-
struction). Thus, protection against malice is harder to estab-
lish and at the same time more important.

Accidents may damage data. Of course it may expose data as well,
but this is presumably less harmful if no malice is present. Ac-
cidents may be caused by user mistakes, software errors in system
programs (bugs) or hardware malfunctioning.

Considering tolerancy against faults, there are two important
factors:

- r_e_d_u_n_d_a_n_c_y_: Errors are only detected through the presense of
redundancy. If enough redundancy, faults may even be re-
paired. So the amount of redundancy in the different organ-
izations of the file system is an important design parame-
ter.

- p_h_y_s_i_c_a_l_ _l_o_c_a_t_i_o_n_ _o_f_ _f_a_u_l_t_ _s_e_n_s_i_t_i_v_e_ _d_a_t_a_: Errors are prone
to hit backing storage "clustered" around a single physical
location (e.g. disrupting one or a few adjacent sectors of a
disk). Therefore sensitive data should be d_i_s_t_r_i_b_u_t_e_d_, li-
miting the effect of local errors: one disrupted sector of a
disk should not cause total disruption of the disk ("casca-
ding" of errors).

3_._5_._1_ _ _ _ _ _O_b_j_e_c_t_i_v_e_s_ 3.5.1

- The administrative data of the file system are sensitive
data, and should consequently be fault tolerant. This
includes:
1) least possible "cascading" of minor errors.
2) regeneration of disrupted areas on basis of intact
areas ("scavenging").

- authorization enforcement should be supported.

- reservation and marking of malfunctioning areas (e.g. areas
with persistent parity errors).

F_ 4_._ _ _ _ _ _ _ _ _D_E_S_I_G_N_ _O_F_ _T_H_E_ _I_N_D_I_V_I_D_U_A_L_ _L_E_V_E_L_S_ 4.

In the following chapters 5 to 10, the design of the individual
levels of the model introduced in chapter 2 will be discussed.

The discussion of the Access Methods (AM) level and the Logical
File System (LFS) level will be postponed until after discussion
of Basic File System (BFS) and lower levels. The latter levels
must -- to a certain degree -- be independent of the type of en-
vironment, since files generated in one milieu must be compatible
with versions of the file system in another milieu.

Each of the chapters 5 to 10 is divided in five:

- short general description of the level in question
- objectives
- design
- interface functions to the level
- concluding remarks.

The d_e_s_i_g_n_-sections consist of an iterative sequence of p_r_o_p_o_s_a_l_s_
for solution of a stated subproblem. Each proposal is clearly
emphasized in the text.

Examples of actual systems employing the proposal in question, is
given immediately below each proposal. Following references to
actual systems are used:

Burroughs B5500 MCS (Ravn 1979)
IBM OS/360 (Kristensen 1979)
OS6 (Stoy and Strachey 1972)
PDP SOLO (Brinch Hansen 1977)
RC3600 (Olsen 1978)
RC System 2 (Brinch Hansen 1971)
RC System 3 (Aris et al. 1978)
TITAN (Kristensen 1979).

F_5_._ _ _ _ _ _ _ _ _B_A_S_I_C_ _F_I_L_E_ _S_Y_S_T_E_M_ _(_B_F_S_)_ 5.

This module is concerned with maintenance of the file d_e_s_c_r_i_p_t_-
o_r_s_. The desriptor provides the overall information about the
file, such as "length" and "location".

Data needed for controlling multiaccess to the file (shared and
exclusive accessing - prevention of deadlocks, etc.) is also
located in the file descriptor.

5_._1_ _ _ _ _ _ _ _O_b_j_e_c_t_i_v_e_s_ 5.1

- Must give all relevant information on the overall status of the
files

- the descriptors are essential for accessing of files. They must
be fault tolerant.

5_._2_ _ _ _ _ _ _ _D_e_s_i_g_n_ 5.2

The main problems of the BFS-level are: (1) the organization of
descriptors, (2) redundancy in locating descriptors, and (3) the
actual contents of a descriptor.

5_._2_._1_ _ _ _ _ _O_r_g_a_n_i_z_a_t_i_o_n_ _o_f_ _t_h_e_ _D_e_s_c_r_i_p_t_o_r_s_ 5.2.1

The simplest scheme for organization of the file descriptors is
to associate them directly with the symbolic file names.

P_r_o_p_o_s_a_l_: File descriptors are put in the catalog(s) together
with the symbolic filenames. Thus there are one de-
scriptor for each name.
(RC system 3, PDP SOLO, IBM OS/360).
\f

This organization scheme -- coupled with a reasonable organiza-
tion of the catalog(s) (e.g. hashed) -- ensures relative quick
retrieval of all relevant information on a single file. On the
other hand search lengths may grow fast, as the size of single
entries are rather big (there can only be few entries per page),
so the fillgrade must be kept low.

A more serious problem is that of multiple naming of single phy-
sical files. If users are to have individual and different names
for a file, this implies that they have individual descriptors
for the file. This leads to inconsistency problems among the de-
scriptors, as the information contained in a descriptor, are at-
tributes to the d_a_t_a_ of the file (and not to the n_a_m_e_s_ of the
file), and consequently should be the same for every name-entry.

P_r_o_p_o_s_a_l_: There is a unique file descriptor for every file. The
descriptors are put in a special system file. Conse-
quently a file has a unique internal name (called the
f_i_l_e_ _i_d_e_n_t_i_f_i_e_r_) which is defined by the index of the
descriptor in the file of descriptors.
(OS6, Burroughs B5500 MCS)

In this case the catalog system maps the symbolic names into
identifiers.

No inconsistency problem exists, as descriptors are unique.

Searching for a given name in a catalog can be fast, because many
entries may be found on the same page (size of entries is small,
consisting only of a symbolic name and an identifier).

On the other hand it is slower to access the data of a given
file, because it is necessary to access the catalog as well as
the descriptor to find the physical address of the first file-
page.
\f

Of course the slowdown would not be significant if the whole de-
scriptorfile was held in core. This is not realistic as it would
mean that filedescriptors for every accessible file in the system
should exist in core, introducing severe restrictions on the
number of files. Such a limitation would be antagonistic to the
stated objectives of the file system.

Another problem is the vulnerability of a descriptorfile: if the
file is erroneously deleted or disrupted the whole associated
volume of backing storage will become inaccessible. If the pages
of the file are clustered physically on the backing storage
(which normally is optimal for allocation of files) a hardware
error disrupting even a minor part of the backing storage (e.g. a
few sectors of a disk) may render the access to several files
impossible.

P_r_o_p_o_s_a_l_: File descriptors are part of the individual files (e.g.
placed on the first page of any file). A file is unam-
biguously identified (within a volume) by the address
of its first page.

This proposal combines the best features of the two former pro-
posals:

- the catalog system is totally independent on the Basic File
System, because files can be identified uniquely by internal
names (it is assumed that the address of a given file>s first
page is fixed through the whole lifetime of the file). Alas, no
inconsistency problem!

- no superfluous backing storage accesses in connection with
"opening" of a file are needed, as the first page of the file
has to be read anyway (this page contains information about the
location of the single pages of the file, as shall be seen in a
following chapter.

- descriptors are automatically distributed on the volume, there-
by diminishing the risk of catastrophes when single pages de-
velop hardware faults.
\f

Unfortunately, the assumption of fixed first pages of all files,
is very inconvenient when one considers reorganization. Reorgani-
zation obviously involves moving of files, and it is difficult --
if not impossible -- to find a reasonable procedure for reorga-
nizing the files of a volume, and at the same time updating the
references to first pages (file identifiers) in all of the cata-
logs (no backward references from files to catalogs exist).

P_r_o_p_o_s_a_l_: File descriptors are placed at the first page of any
file. Addresses of such first pages ("descriptor
pages") are put in a table, the "File Location Table",
contained in a special system file. A file is identifi-
ed unambiguously (within a volume), by an index to the
table-entry referencing the first page of the file.

This proposal resembles the last-but-one proposal. It ensures the
independence of file identifiers and file location thus enabling
efficient reorganization.

Unfortunately it suffers from some of the same shortcomings as
the last but one proposal:

- opening of a file requires one extra access to find the
address of the descriptor page in the proposed File
Location Table.

- the File Location Table is very vulnerable, since the
data are clustered.

As regards the first shortcoming: this is the obvious effect of
creating independence between file identifiers and file locations
(in conjunction with the demand of no backward references from
files to their file names). Therefore this must regrettedly be
accepted, in the good cause of flexibility and reorganization.

Remark: Unlike the previous proposal a lot of table entries may
be placed on a single table page (the entries contain only an
address). If time used for opening of files is critical, it may
be practicable, to let the whole (or a large part of the) table\f

be resident in core. This should naturally be controlled by an
initialization parameter to the file system.

As regards the second shortcoming (vulnerability of table), this
may be eliminated by the following scheme:

- the File Location Table index of a file is stored in the
descriptor when created (i.e. "pushed down" into the
descriptor). Descriptors are distributed and therefore
fault tolerant, seen as a whole.

- the recovery program finds all the descriptors of the
volume, thereby reconstructing the File Location Table.

This leaves the problem of finding every descriptor on a volume
without use of the File Location Table. That is the subject of
the next subsection.

5_._2_._2_ _ _ _ _ _R_e_d_u_n_d_a_n_c_y_ _C_o_n_c_e_r_n_i_n_g_ _L_o_c_a_t_i_o_n_ _o_f_ _D_e_s_c_r_i_p_t_o_r_s_ 5.2.2

As pointed out in the previous subsection, it is essential for
recovery after faults on the File Location Table, that the loca-
tion of all file descriptors may be found, without use of the
File Location Table.

There are two different approaches to this:

(1) A separate table, referencing all descriptors, exists.

This could for instance be a bit table, referencing all the
pages on which descriptors reside. Or it could simply be a
copy of the File Location Table.
Use of this method implies that the table must be maintained
in parallel with the File Location Table.

(2) The descriptor pages are marked in such a way that they can
be identified by a recovery program ("scavenger"), scanning
all the pages of the volume (e.g. Redell 1980).
\f

It does seem impractical to mark every page of a volume,
this requiring a little of the data area of every page. The
problem must be solved otherwise: The descriptor pages con-
tain a special identification in the first few words. The
chances of duplicating such an identification -- by acci-
dent, or by maliciousness -- on a non-descriptor page, could
be kept extremely low, by using a well-chosen and suffici-
ently long identification.

Since method (1) adds overhead to normal system operation, method
(2) will be adapted in the final design.

The problem of choosing a reasonable identification for
descriptor pages is rather dependent on the actual implementation
of the file system. The subject will therefore not be elaborated
in detail here, but a few hints will be given:

- the identification should not contain ASCII-characters
(i.e. no low-valued bytes), since they are common in
datafiles.

- there must be no valid identification on unallocated
pages: when a file is deleted, the identification on its
descriptor page must be marked or destroyed.

- the length must be chosen so that the probability of
interpreting a wrong page as a descriptor page is
extremely low. An example:

Define p(xDd1Uu xDd2Uu ... xDdnUu) as the probability of meeting the
byte-pattern xDd1Uu, xDd2Uu, ... xDdnUu (xDdiUu is a byte-value) in the
first n bytes of any page on a given volume.

The problem is to define an identification xDd1Uu, xDd2Uu ... xDdnUu,
ensuring a sufficiently low p(xDd1UuxDd2Uu ... xDdnUu).
\f

If we -- for the sake of simplicity -- assume that all
values of any byte on a data-page are equally probable we
obtain following:

(1) all byte values are independent, i.e.:
p(xDd1UuxDd2Uu ... xDdnUu) = p(xDd1Uu) p(xDd2Uu) ... p(xDdnUu)

(2) for any two byte-values xDd1Uu and xDd2Uu
M_M_ _1_ _
p(xDd1Uu) = p(x) = 2Uu-8Dd =
P_P_ 256

(3) p(xDd1UuxDd2Uu ... xDdnUu) = p(xDd1Uu) p(xDd2Uu) ... p(xDdnUu)
= (p(xDd1Uu))UunDd
= 2Uu-8nDd

If we choose n = 4, then

M_M_ _1_ _
p(idDd1Uu idDd2Uu idDd3Uu idDd4Uu) = 2Uu-32Dd =
P_P_ 2 G.

This means that the probability of misinterpreting one or
more pages of a 250 Mb-disk (which contains approximately
500 K pages), is:

M_M_

P_P_

If we choose n = 6 we obtain analogously:

M_M_ P = 2Uu-48Dd 500K = 0.2 10Uu-6Dd%
P_P_ 6

The above considerations serve to show that it is feasible
to reconstruct the File Location Table by scanning all the
pages of a backing storage volume -- if the descriptor
page identification is carefully chosen.

5_._2_._3_ _ _ _ _ _C_o_n_t_e_n_t_s_ _o_f_ _D_e_s_c_r_i_p_t_o_r_s_ 5.2.3

The descriptor contains attributes to the d_a_t_a_ of the file, such
as l_e_n_g_t_h_ and l_o_c_a_t_i_o_n_ of the file. Please note, that another set
of attributes must be associated with the n_a_m_e_s_ of the file,
defining use- and access-rights (for authorization enforcement).

Superimposing levels to the file system may want to insert their
own attributes in the descriptor. This may be done in a special
user-accessible area of the descriptor -- the t_a_i_l_.

The descriptor could furthermore contain information for con-
trolling of multiaccess, such as:

- open _count (number of times the file has been "opened"
without later "closing")

- update _count (number of current open connections in
update-mode)

- retrieve _count (number of current open connections in
retrieve _mode)

- exclusive _reserved (a flag: if set by a user opening the
file, all other users are excluded from access to the file
-- this may for instance be used in connection with
writeaccess),

and some sort of information concerning the last updating of the
file (for backup purposes), e.g.:

- time and date of last update.

5_._3_ _ _ _ _ _ _ _F_u_n_c_t_i_o_n_s_ _o_f_ _t_h_e_ _B_F_S_-_L_e_v_e_l_ 5.3

A reasonable set of functions, accessing the actual contents of
files and manipulating descriptors, is:

open (file _identifier, mode)
close (file _identifier)

get (file _identifier, buffer, page _no, page _count)
put (file _identifier, buffer, page _no, page _count)

create (descriptor _mask, file _identifier)
delete (file _identifier);

extend (file identifier, amount)
truncate (file _identifier, amount)

settail(file _identifier, tail)

mount (volumename, device)
dismount (device)

The f_i_l_e_ _i_d_e_n_t_i_f_i_e_r_ is the internal name of the file as given by
the catalog system. M_o_d_e_ indicates write- or read-access or both.
Furthermore it specifies private (exclusive) or shared access.
P_a_g_e_ _n_o_ is the logical number of the first page to be read/writ-
ten. P_a_g_e_ _c_o_u_n_t_ denotes the number of pages from (or to) b_u_f_f_e_r_,
to be transferred. D_e_s_c_r_i_p_t_o_r_ _m_a_s_k_ contains relevant data (attri-
butes) for creation of a new file. A_m_o_u_n_t_ is the number of pages
to allocate/deallocate. T_a_i_l_ contains the data to be put in the
tail of the descriptor.

It should be noted that the format of get- and put- functions
facilitates multiple page transfers to and from backing storage.
The interface would obviously be simpler if only offering single-
page transfers: get/put (file _identifier, buffer, page _no). How-
ever, multiple-page transfer is, in many cases, much more
efficient: All the pages of a program may be loaded by issue of a
single read-command to the controller of the backing storage (on\f

a disk all the pages of a single surface of a cylinder may read
in one revolution). This will be discussed in detail later.

Mount and dismount are functions to manage the including and
removal of backing storage volumes. V_o_l_u_m_e_n_a_m_e_ is a userdefined
symbolic name of the volume to be mounted. A simple check is
performed to ensure the correct identity of the mounted volume.
D_e_v_i_c_e_ is an identification of the device on which the volume
is, or shall be, mounted.

5_._4_ _ _ _ _ _ _ _C_o_n_c_l_u_d_i_n_g_ _R_e_m_a_r_k_s_ 5.4

All the stated objectives of section 5.1 are reasonably met.

F_6_._ _ _ _ _ _ _ _ _F_I_L_E_ _O_R_G_A_N_I_Z_A_T_I_O_N_ _S_T_R_A_T_E_G_Y_ _M_O_D_U_L_E_ _(_F_O_S_M_)_ 6.

This level defines the internal organization of a single file. It
maintains this information in case of creation, deletion,
extension, or "shrinking" of a file.

6_._1_ _ _ _ _ _ _ _O_b_j_e_c_t_i_v_e_s_ 6.1

Prime objectives:

- dynamics: a file should be able to grow or shrink arbitrarily,
at least at the tail of the file

- the amount of allocated or deallocated pages must be a figure
available to the user (for resource management purposes)

- the time for access of single pages of the file must be pre-
dictable within reasonable limits

- sequential access (e.g. loading of programs) must be fast

- random access must be reasonably fast

- file allocation (and deletion) must be reasonably fast

- when a page is deallocated (e.g. by deletion of a file), the
data contained herein should never be seen by another user
incidently allocating the same page later.
Admittedly such recreation of -- possibly sensitive -- data is
very random and data can furthermore always be effectively
deleted by the user -- if so wanted -- by overwriting them
before deallocation of the corresponding pages.
However, in keeping with Murphy>s laws: if not forced by the
system, users will not act reasonable!
\f

Secondary objectives:

- a file should be able to grow or shrink anywhere: beginning,
midst, or tail

- files may be arbitrarily big, encompassing more volumes of
backing storage.

6_._2_ _ _ _ _ _ _ _D_e_s_i_g_n_ 6.2

The design of the FOSM-level is treated in four subsections:

- page organization of files
- organization of index tables
- dynamics of files
- unreadability of deallocated pages

6_._2_._1_ _ _ _ _ _P_a_g_e_ _O_r_g_a_n_i_z_a_t_i_o_n_ _o_f_ _F_i_l_e_s_ 6.2.1

The three wellknown techniques of file organization are:

- contiguous organization
- chained organization
- index table organization.

P_r_o_p_o_s_a_l_: Files are organized contiguously.
(RC System 2).

This is a simple strategy: the file is identified by the address
of its first page. The address of any page in the file is easily
found by a simple computation. Many files can be open at the same
time, as organization overhead in terms of core storage is low.
Sequential accessing time is minimal (for the sake of simplicity
it will be assumed in the following that latency time will be
minimal when accessing sequential pages of the backing storage.
This assumption may in fact not be true for all devices, but a
cunning mapping from the logical pagenumbers of the device, into\f

actual physical locations, may prove sufficient to fulfil the
assumption as shall be seen later -- cf. subsection 7.2.3).

Remark: The postulate that contiguous files minimize sequential
access time is true only for single-user systems. As Madnick
(Madnick 1969a) points out, the question is open to dispute when
two sequential files are scanned concurrently on the same volume,
since this will defeat the assumed sequential accessing!

The notable drawback of the proposal is that file allocation (or
deletion) is unacceptably slow: files must be compacted by copy-
ing from time to time. Otherwise waste of space will be excessi-
ve, since deleted files will create large holes of potentially
unusable space.

The solution to the problem is to scatter the individual pages of
the file:

P_r_o_p_o_s_a_l_: Files are organized by chaining of pages. Every page of
a file contains a pointer to the next page.

Again a relative simple strategy: the file is identified by the
address of its first page, organization overhead is small in
terms of core storage and file allocation, deletion, extension,
and shrinking may be fast.

The chief deficiency of the method is the use of variable time
for accessing of random pages.

This could be solved by holding the chain fields of the file in
core storage, instead of distibuting them on the individual
pages.

P_r_o_p_o_s_a_l_: Files are organized by chaining of pages.
Chains are held in core storage.
(RC System 3)

Access time is still variable but tolerably so (at least for
files not excessively huge). However, organization overhead in\f

terms of core storage is considerable. The only realistic way to
employ the method is to hold a s_i_n_g_l_e_ chain table (perhaps per
volume) in core, describing all files as well as free pages. If
such a scheme is used many files may be open at the same time,
since opening of yet another file will not add notably to the
required core size.

On the other hand the total number of backing storage pages con-
nected to the system will be severely limited by the core size.

This may be partially overcome by making the pagesize very big
-- or better: address pages in "chunks" (called slices), making
files consist of a number of slices, each referencing a fixed
number of contiguous pages. This allows the chain table to span a
greater number of pages using the same amount of core. Unfortuna-
tely this method introduces an unpleasant dependence between the
size of available core storage and organization of backing sto-
rage (i.e. the size of the slices). Restricted core size implies
big slices with consequent wasting of backing storage (in mean
half a slice is wasted per file). Furthermore the choise of slice
size may be difficult, because volumes could be exchanged between
computer systems of different available core size.

Another drawback is the sensitivity of the chaintable (which of
course must be stored on backing storage as well) towards errors
disrupting small parts of it. A ruined chaintable of a few pages,
makes the whole volume of possibly several hundred thousand pages
inaccessible.

Finally, the total number of files is limited (at least by the
number of chain fields in the chain table). This may be tolerable
on many computer installations, but it is not tolerable in gen-
eral, since the file system should be applicable on an assorted
range of configurations.

P_r_o_p_o_s_a_l_: Files are organized by index tables. Each file contains
such a table on its first page.
(PDP SOLO, RC3600, OS6)
\f

This method makes the core size overhead for file organization
proportional to the number of opened files. Naturally it limits
the number of opened files on a given installation, but the pro-
blem is tolerable, since the number of concurrently opened files
is prone to grow with the size of available core on different
computer configurations (demands to the degree of "multi-using"
naturally tends to grow with the available core size: small core-
size micro-computers are usually expected to be single-user sys-
tems, for instance).

Extension and shortening of files is simple, assuming the needed
free space is available in the index table.

Allocation and deletion of files does not involve time consuming
activities, as far as the FOSM-level is concerned.

Sequential access time is a function of the allocation strategy
employed when the file was created (and perhaps extended). The
actual allocation of pages might very well be contiguous in many
cases.

Yet, the method i_s_ rather inefficient in terms of core storage.
This means that the index tables should be as compressed as pos-
sible. This is not naturally compatible with the objective of ar-
bitrarily big file sizes. So, let us turn the problem of organi-
zation of index tables.

6_._2_._2_ _ _ _ _ _O_r_g_a_n_i_z_a_t_i_o_n_ _o_f_ _I_n_d_e_x_ _T_a_b_l_e_s_ 6.2.2

The organization of an index table has several degrees of
freedom:

- structure of the individual indices,
- the address space of the individual indices,
- representation of index tables in core,
- single-level or multi-level indices.

6_._2_._2_._1_ _ _ _S_t_r_u_c_t_u_r_e_ _o_f_ _t_h_e_ _I_n_d_i_v_i_d_u_a_l_ _I_n_d_i_c_e_s_ 6.2.2.1

P_r_o_p_o_s_a_l_: The individual indices reference chunks of pages. The
chunks are of fixed size (e.g. one page).
(PDP SOLO, OS6)

Gives easy and fast indexing of a given page. Extension or shor-
tening is simple (if there is room enough in the indextable)
although it might involve moving of indices within the table.

Unfortunately the method is grossly core-consuming, as the index-
table of a file will grow proportionally to the file size. This
core-consumption may be reduced by employing following method:

P_r_o_p_o_s_a_l_: The individual indices reference chunks of pages. The
chunks are of variable size (multiplum of slice size).
(RC3600)

In this case the number of indices is proportional to the degree
of scattering of files: if a file is contiguously allocated, only
one index is needed. In the worst case a file will be totally
scattered, requiring double as much space as in the previous pro-
posal (an index contains two words: address and length). However,
if a huge file is so scattered that the latter proposal requires
more space than the former, it might be advisable to reorganize
the volume anyway. Reorganization should compress files (i.e.
allocate them contiguously).

Finding the address of a given page may involve some searching
in the index table, but if the searching is conducted in core
this is absolutely tolerable.

The only notable drawback is that this organization does not al-
low refinement in connection with the concept of "sparsely fil-
led" files. Such files (e.g. large random accessed hash coded
files where only a few of its pages actually contain data) could
easily be implemented by the previous proposal (just placing
zeroes (say) in the indices referring to empty pages not yet
allocated). However, this is not considered a serious problem,\f

since such file organizations also could be implemented on top of
the latest proposed organization. This would just require a table
(contained in the file) mapping the hashcoded page numbers into
the actual (internal) page numbers. Since the problem was that of
s_p_a_r_s_e_l_y_ filled files, this method would not introduce much
overhead compared to the first one.

All in all the method is preferable to the first proposed.

6_._2_._2_._2_ _ _ _T_h_e_ _A_d_d_r_e_s_s_ _S_p_a_c_e_ _o_f_ _t_h_e_ _I_n_d_i_v_i_d_u_a_l_ _I_n_d_i_c_e_s_ 6.2.2.2

When using computers with a word-length of 16 bits, there is a
problem of obtaining sufficient address space for large devices.

A large device may easily comprise several hundred thousand
pages. One 16-bit word will only address up to 64 K pages. What
to do?

P_r_o_p_o_s_a_l_: A new abstract level is introduced: a u_n_i_t_.
A unit comprises up to 64 K pages. A device may
comprise one or several units. Each unit is an
independent address space.

This ensures the address space to be maximally 64 K. The trade-
off is the added complexity in the interface to devices (you
should not only distinguish devices but units too). Furthermore
files are restricted to the size of 64 K pages.

P_r_o_p_o_s_a_l_: A pageaddress consists of 32 bits. Thus the address
space is 4 G pages.

A 4 G range for pages is certainly large enough (if there are 512
bytes per page the maximum address space comprises 2 Tb = 2 x 10Uu12Dd
bits), and that is larger than any currently available storage
device).
\f

This method may work well with processors offering 32-bit arith-
metic -- which will be the case of many of the coming micro-pro-
cessors: Motorola MC68000, Zilog Z8000, Intel 8086, etc. Unfortu-
nately it is not the case of many current computers. If no "more-
word arithmetic" is supported, it will be very tedious to use
32-bit arithmetic.

Considering the fast development of computers, it is chosen to
solve the problems by brute force: All page-addresses consist of
32 bit, but no double-word arithmetic is applied in computers
where this is not absolutely needed. In most cases an address
space of 64 K pages i_s_ enough (at least for microcomputers) and
if a larger address space is needed this will often be connected
with larger computers (offering 32-bit arithmetic)...

6_._2_._2_._3_ _ _ _R_e_p_r_e_s_e_n_t_a_t_i_o_n_ _o_f_ _I_n_d_e_x_ _T_a_b_l_e_s_ _i_n_ _C_o_r_e_ 6.2.2.3

When a file has been opened it is natural to place some
administrative data in core storage, easing access to the file.
Amongst such data is the index table for the file. If this is n_o_t_
placed in core, an access to an arbitrary page of the file will
involve two backing storage accesses -- first to retrieve
information from the indextable concerning the location of the
requested page, followed by the actual access.

P_r_o_p_o_s_a_l_: The indextable of a file is placed in core storage
whenever the file is opened (it is of course removed
when the file is closed).
(PDP SOLO)

This is the natural thing to do. Access times for arbitrary pages
of the file are kept minimal. Furthermore it fulfils the ever im-
portant property of access time predictability.

On the other hand a lot of files may be open at the same time,
consuming a lot of core storage for their indextables.
\f

In matter of efficiency this may be acceptable on large com-
puters. But it is certainly not attractive in connection with
small microcomputers and the like. Again, as the prime objective
of the file system is to span the whole range of computers, the
consumption of core for indextables must be restricted.

P_r_o_p_o_s_a_l_: Only part of an open file>s indextable is put in core.
When a page not referenced in this part is requested,
the whole indextable will be read from backing storage
and a new part (according to some clever strategy) is
put in core.
(RC3600)

This is in fact a method of "paging" the indextables. The nice
thing about the method is that arbitrarily big and complex
indextables (e.g. in several levels) could be allowed easily.

Unfortunately the trade-offs are obvious:

If the size of the core part of an indextable is smaller than the
whole indextable for a given file, chances are - naturally - that
an arbitrary access to the file, will reference a page "outside"
this part (give a "page fault"). Thus the mean access time for
the file will increase -- in fact proportional to the degree of
fragmentation of the file. Furthermore predictability of access
times will be lost (for the user, whom the actual organization of
files is irrelevant).

One could design various crafty strategies for the continous re-
newal of the core part of index tables. Many such strategies are
known from the litterature on program-execution involving paging-
methods. It is also well-known that these techniques work relati-
ve well as long as programs are executed sequentially. However,
when execution involves jumps to arbitrary points in the program,
there is a danger of severe decrease in execution speed -- so-
called "trashing".
\f

The problem of "paging" the indextable of an open file are ana-
logous. As long as accessing is sequential, everything may work
relative well. But if access is random (which is often the case
for datafiles) trashing may occur no matter how clever the paging
strategy.

This is obviously a dilemma between antagonistic objectives: in-
dextables must be used (allowing scattering of files) -- and:
the importance of fast access to the individual pages of an open
file.

The compromise may lie in the possibility of "compressing" files
to a reasonable degree of scattering (i.e. organize the files in
a limited number of chunks, thereby diminishing the number of in-
dices in index tables).

P_r_o_p_o_s_a_l_: Indextables are restricted to a limited number of
chunks. Indextables of open files are core resident.

When the size of indextables is restricted, they will -- occasio-
nally -- show full, but this is not an irrepairable situation, as
the backing storage could be reorganized. Admittedly it may cause
some disadvantage to user, but this should be balanced by the
ever minimal and always predictable access times.

One may consider implementing automatic reorganization, initiated
by the file system whenever needed. However, this will add unac-
ceptably to the complexity of the file system nucleus. If it is
essential to have a given volume reorganized continuously, a
reasonable way would be to put the reorganization utility program
in an individual low-priority process of its own. Of course this
will require some careful synchronization between file system and
reorganization process.

The maximum number of indices in the index-table is hard to de-
fine. The number 25 has rather arbitrarily been chosen, hopefully
balancing the need for low consumption versus the realistic
degree of fragmentation.
\f

Judging from the empirical figures of section 3.3, the choise is
not too bad: If a slice size (i.e. minimal allocation unit) of 3
is chosen, 25 chunks will be able to reference at least 75 pages
in the very w_o_r_s_t_ of cases (that is to say: no two chunks can be
placed contiguously). The tables show that more than 80 percent
of all files were of less size.

6_._2_._2_._4_ _ _ _S_i_n_g_l_e_-_L_e_v_e_l_ _o_r_ _M_u_l_t_i_-_L_e_v_e_l_ _I_n_d_i_c_e_s_ 6.2.2.4

The above chosen method for core representation of index tables
inhibits any practical use of multilevel indexing. It should be
noted however, that a sort of multilevel indexing system could be
implemented o_n_ _t_o_p_ of the here described file system. The higher
levels of indices might be represented by individual files con-
taining references to other files.

Since a high-level interface to the file system obviously may
utilize such techniques, there is no urgent need for multi-level
techniques in the file system "nucleus".

6_._2_._3_ _ _ _ _ _D_y_n_a_m_i_c_s_ _o_f_ _F_i_l_e_s_ 6.2.3

It was stated as an objective of this level that files should be
able to grow or shrink -- at least at the tail -- but preferably
anywhere in the file.

Why demand this of a file system?

Well, of course any possible problem of this nature might be sol-
ved by simply copying the file into a new file with the wanted
format. But copying of huge files is a time-consuming activity.
Since appending new portions of data to a file is a frequent
operation, it is not feasible to copy every time. Alas, e_x_t_e_n_d_i_n_g_
of files (at the tail) must be a facility of the file system.
\f

When a file is extended, we want to allocate the pages of the
extension in as few chunks as possible -- all in line with the
general philosophy of allocation. If a file is extended one page
at a time, there will be no possible way to determine a reason-
able strategy, as to which free pages to allocate.

The alternative is to allocate all the needed pages at once. If
the number of needed pages is not known at the time of extending,
it may be profitable to allocate the maximum-number of pages
possibly needed. This implies that it must be a facility of the
file system to t_r_u_n_c_a_t_e_ a file (getting rid of the unwanted pages
after an extension).

These are the arguments for dynamics in the tail of a file. Of
course they may be applied to the corresponding problem of dyna-
mics anywhere in a file.

However, one should balance the need for general dynamics against
the added complexity to the system.

It occurs that examples of natural applications using general dy-
namics are: (1) text editors and (2) random access file organi-
zations.

As regards the text editors, arbitrary dynamics might make fast
insertion of pieces of text possible. On the other hand it is
hard to imagine a text editor system not preserving a copy of the
old text for security reasons. So it seems that text editors are
not really a strong argument for the general dynamics.

Random access file organizations are probably a much better ar-
gument. General dynamics may facilitate to make "cleaner" designs
of such organizations, for instance hiding overflow problems.

However, the present design will argue that problems c_a_n_ be sol-
ved within the framework of tail dynamics, and the file system
will consequently be restricted on this point.
\f

If the need for general dynamics becomes prevalent, the file sys-
tem could be expanded with moderate effort to meet such demands.

6_._2_._4_ _ _ _ _ _U_n_r_e_a_d_a_b_i_l_i_t_y_ _o_f_ _D_e_a_l_l_o_c_a_t_e_d_ _P_a_g_e_s_ 6.2.4

One of the stated objectives is to protect deallocated pages from
being read by arbitrary users, allocating the same page later.

There are two possible solutions to this: (1) overwriting deallo-
cated pages (e.g. by zeroes) or (2) administratively prohibiting
any possible reading of such pages.

The former solution is obviously the safer one: data of
deallocated pages is p_h_y_s_i_c_a_l_l_y_ deleted, never to be accessed
again. On the other hand it will be rather time consuming.

Since the latter solution may be applied safely -- as shown below
-- this will be used in the final design.

P_r_o_p_o_s_a_l_: Reading of pages not yet written (in the context of the
actual file) is prohibited.

To fulfil this proposal we need two length-variables in the file
descriptor: (1) the total number of pages allocated to the file
and (2) the number of pages actually written onto the file.

Let us first consider sequential reading and writing. In this
case reading past the number of written pages should simply be
prohibited. This allows the user to see his own written data, and
nothing else.

Admittedly things look a little different for random access
files: When a random access file is created it must contain a
given number of pages regardless of whether there are any data on
these pages or not. However, this objection may be discarded,
observing that a random access file of arbitrary size may be
created by first creating and opening it for sequential writing,
secondly write the wanted number of pages (e.g. writing zeroes),\f

and finally -- after closing -- regard it as (and possible change
its attributes to) a random access file.

Thus unreadability of pages not yet written could be ensured by
obeying these simple rules:

- two length-variables are associated with each file:
ALLOCATED-LENGTH and WRITTEN-LENGTH.

- following assertions must hold:
(1) WRITTEN-LENGTH <_ ALLOCATED-LENGTH
(2) If page number R is read from the file:
R <_ WRITTEN-LENGTH
(3) If page number W is written onto the file:
W <_ WRITTEN-LENGTH + 1.

6_._3_ _ _ _ _ _ _ _F_u_n_c_t_i_o_n_s_ _o_f_ _t_h_e_ _F_O_S_M_-_L_e_v_e_l_ 6.3

Functions on the level of FOSM are:

open (descriptor, mode)
close

get (buffer, page _no, page _count)
put (buffer, page _no, page _count)

extend (amount)
truncate (amount).

The only function that requires explicit identification of the
target file is "open". Reason is: it is assumed that open initia-
lizes a (not-in-use) "FOS-Module" which then will represent the
file. The rest of the functions will accordingly, in some way --
dependent on the implementation language -- refer to such a spec-
ific module. The functions are thus "entries" (in the Concurrent
Pascal (Brinch Hansen 1977) sense of the word) to any of the
individual modules.
\f

D_e_s_c_r_i_p_t_o_r_ is the file descriptor found by the Basic File System.
It is used for initialization of the "FOS-Module". M_o_d_e_, b_u_f_f_e_r_,
p_a_g_e_ _n_o_, a_m_o_u_n_t_, and p_a_g_e_ _c_o_u_n_t_ is described in section 5.3.

6_._4_ _ _ _ _ _ _ _C_o_n_c_l_u_d_i_n_g_ _R_e_m_a_r_k_s_ 6.4

The design of the FOSM-level is essential to the performance of
the whole file system.

None of the p_r_i_m_a_r_y_ objectives of section 6.1 are neglected.

However, both of the s_e_c_o_n_d_a_r_y_ objectives are chosen n_o_t_ to be
met in this design: Files are n_o_t_ allowed to grow or shrink any-
where, and files are n_o_t_ allowed to encompass more volumes of
backing storage:

As regards the reasons for the limited dynamics of files, this is
extensively discussed in subsection 6.2.3.

The limitation of files to single volumes is not considered
really serious. Backing storage devices must be chosen to fulfil
the demands of the applications using them!

F_ 7_._ _ _ _ _ _ _ _ _D_E_V_I_C_E_ _S_T_R_A_T_E_G_Y_ _M_O_D_U_L_E_S_ _(_D_S_M_)_ 7.

The modules on this level define and describe the (possibly
different sorts of) devices on which backing storage is mounted.

Since the physical characteristics of the various known devices
may differ a lot, miscellaneous optimizing access strategies m_a_y_
be embedded in the modules.

The possible strategies for a given device are split in two
groups: (1) sequencing of i/o-requests and (2) allocation/deallo-
cation of pages on the device.

7_._1_ _ _ _ _ _ _ _O_b_j_e_c_t_i_v_e_s_ 7.1

Prime Objectives:

- when a file is created or extended with a number of pages,
these pages should be allocated in as few chunks as possible
(i.e. preferably contiguous). Reasons are: minimizes size of
indextable; makes loading of programfiles fast.

- allocation/deallocation should be reasonably fast.

- pages developing faults must be marked, so that they are
unallocatable.

Secondary objectives:

- it must be easy to embed new sequencing strategies (for i/o-
requests) in the modules. This is primarily a requirement to
the implementation structure of the file system.
Reason is: it is not considered part of the file system nucleus
to employ sequencing strategies, since these are not imperative
to the function of the file system (a given device may be ac-
cessed with or without sequencing strategies with equal results
-- but presumably with differing efficiency). On the other
hand, sequencing may improve efficiency considerably on some
computer installations (e.g. in multiprogramming, multi-user
environments), and should therefore be readily supported by the
file system.

7_._2_ _ _ _ _ _ _ _D_e_s_i_g_n_ 7.2

7_._2_._1_ _ _ _ _ _S_e_q_u_e_n_c_i_n_g_ _S_t_r_a_t_e_g_i_e_s_ 7.2.1

Strategies for sequencing i/o-requests are -- as said above --
highly device dependent and are not an immediate part of the de-
sign to be discussed. Please refer to the litterature, e.g.
(Teorey et al. 1972) or (Wong 1980), for further discussion on
different algorithms.

7_._2_._2_ _ _ _ _ _A_l_l_o_c_a_t_i_o_n_/_D_e_a_l_l_o_c_a_t_i_o_n_ _o_f_ _P_a_g_e_s_,_ _A_d_m_i_n_i_s_t_r_a_t_i_o_n_ 7.2.2

The DSM must contain datastructures and algorithms to administer
the allocation and subsequent deallocation of backing storage
pages.

Various well-known techniques to do this job, exist.
These will be presented below:

P_r_o_p_o_s_a_l_: Free pages are c_h_a_i_n_e_d_ together.
(RC System 3)

This method has very little overhead in terms of core consump-
tion. The only thing to store is the address of the next page to
allocate (deallocated pages may be inserted before this, or they
may be inserted elsewhere in the chain). The chainfields could be
located on the individual pages, as the pages are not used for
data.

However, when allocation strategies are employed, trying to find
a suitable c_h_u_n_k_ of contiguous pages, the method is not attrac-
tive. Search times will be excessive.

If the chainfields were to be placed in a chaintable in core sto-
rage, the method would show distinct similarities with the method
of chaining file pages. This method was not adopted. For analo-
gous reasons the corresponding method will not be applied here
either.
\f

P_r_o_p_o_s_a_l_: The free pages are organized as a f_i_l_e_, which grows or
shrinks as pages are deallocated or allocated.

This method might work very well if indextables were big enough
allowing arbitrarily fragmentated files. The notable point of the
method is that the tools (i.e. file system procedures) needed to
maintain such a datastructure are already present; why, they are
needed in the user interface to manipulate ordinary files. The
method might be a little inefficient in terms of execution time.

The indextable organization chosen in subsection 6.2.2.3 is not
sufficient for this method.

P_r_o_p_o_s_a_l_: The free pages are administered by a "bit table". In
this table every bit represents a page. If the bit is
set (to "1") this means that the corresponding page
is free; if not set (set to "0") it is allocated.
(PDP SOLO, RC3600)

This gives an excellent survey of the free pages. It is easy to
compute the size of individual "holes" of free pages, allowing
efficient searching.

The method may be rather core consuming. A typical-sized volume
currently holds from 100.000 to 1.000.000 pages (if page size is
512 bytes). This will require a bit table of 6 to 63 kilobytes.
This is of course not realistic.

P_r_o_p_o_s_a_l_: The free pages are administered by a bit table. The bit
table is a file itself and as such "p_a_g_e_a_b_l_e_".

This gives a relative uniform approach to the handling of bit
tables on different sorts of devices, large as small. On the
other hand, allocation -- and especially deallocation -- may
involve several "page faults" decreasing efficiency.
\f

P_r_o_p_o_s_a_l_: The free pages are administered by a bit table. Every
bit references a fixed number of pages. This number
constitutes the minimal unit of allocation (called a
"s_l_i_c_e_").
(RC3600)

This method utilizes the idea that pages may be allocated profit-
ably in units of several pages. Reasons are:

- most files are of size greater than one page.

- files should as far as possible be allocated contiguously.
Allocation in slices tends to help to this purpose.

- extending of files involving allocation of pages will tend to
be less frequent. Often there will be space enough in the last
allocated slice for minor extensions.

- "holes" of free space on the backing storage tend to be larger
(measured in number of pages), because pages are deallocated in
units of slice length too. Large holes are more useful for al-
location purposes.

The main draw-back of the method is the amount of wasted space on
backing storage. This is proportional to the slice size, since
the theoretical mean size of wasted space per file is half the
slice length.

The actual slice size should therefore be balanced according to
the above pro et contras.

The slice size need not be a universal constant, it could very
well vary from volume to volume. In fact it might be feasible to
let the slice size on large volumes tend to be large (as the
waste of space here might be of relative little importance), and
oppositely, let the slicesize of small volumes be small. Other
factors may influence the choise of slice size, such as the ex-
pected mean size of the files to be placed on the volume: the
space wasted is relative little when files are huge -- a file of\f

1000 pages with a 10-page slicelength wastes a maximum of one
percent of the occupied space; not so a file of 1 page (up to 90
percent waste).

Even though this technique may be employed, it is not reasonable
to restrict the size of the bittable to one page (or any other
universally fixed size), because this will introduce an unwanted
restriction on the number of files possible on any volume (the
number of files would be limited by -- at least -- the number of
bits in the bittable).

The compromise is to apply both of the two last methods proposed,
thereby getting a "pageable" table of free slices.

P_r_o_p_o_s_a_l_: The free slices are administered by a "pageable" bit-
table.

If time consumption by allocation and deallocation is critical,
one should choose a slice-size producing a bit table of size less
than a page (eliminating "page faults") remembering the trade-off
in terms of wasted space on backing storage.

7_._2_._3_ _ _ _ _ _A_l_l_o_c_a_t_i_o_n_/_D_e_a_l_l_o_c_a_t_i_o_n_ _o_f_ _P_a_g_e_s_,_ _S_t_r_a_t_e_g_i_e_s_ 7.2.3

The stated objective of allocation strategies (cf. section 7.1)
was to allocate the requested number of pages in as few chunks as
possible.

This objective was under the assumption that contiguously located
pages minimize access time when access is sequential.

The postulate is not true for all devices. It is not all device-
controllers that allow access to an arbitrary number of conti-
guous pages (sectors, segments, etc.). Some demand every block
accessed individually. This will defeat the postulated minimal
access time.
\f

Take as an example a hard disk device: the rotational gap between
two neighbouring pages may be too short to allow the system to
start another access. If pages were placed contiguously in such a
situation, a lot of disk revolutions would be lost when accessing
sequentially. Therefore the pages should be interleaved in such a
way that two consecutive pages on the same cylinder always are
placed with another page parting them. This implies that all
pages on (one side of) a cylinder could be accessed in two revo-
lutions.

The location of logically contiguous pages on two neighbouring
cylinders may also be optimized (i.e. considering the rotational
gap for cylinder shifts).

The technique is refered to as "s_c_r_e_w_i_n_g_" of pages.

The mapping between physical locations and logically contiguous
locations could be placed in software as well as hardware ("soft
screw" respectively "hard screw").

When placed in software this should preferably be a facility of
the driver for the device. Pages written with the use of a "soft
screw" will be practically impossible to read if not accessed via
the same "soft screw" technique. If the "soft screw" is not pro-
perly standardized, this may cause problems in connection with
exchanging of media between different installations.

It is simpler to use "hard screw" (if the physical charachteris-
tics of the device permit it). This is done by the "formatting"
the media, writing appropriate headers on the sectors of each
cylinder. This technique makes the software invariant to the used
screwing strategy.

Other devices may use other and more refined optimization stra-
tegies. We shall not elaborate further on this here, but it will
be assumed that such techniques may be incorporated in the file
system easily.
\f

All in all we may stick to the objective of allocation in as few
chunks as possible, since this -- under reasonable conditions --
will tend to minimize access time for sequential accessing.

The problem of allocation of pages -- by the means of a bit table
-- is therefore reduced to the well-known problem of dynamic sto-
rage allocation. Please refer to e.g. Knuth (Knuth 1969), Vol. 1,
page 435 ff. for a more detailed discussion of strategies. In
accordance with the conclusions of Knuth we will choose the
"f_i_r_s_t_-_f_i_t_" method as the basis of allocation strategy. If no fit
is found, the requested number of pages will be split into two
approximately equal parts, and the algorithm will be used (recur-
sively) on both of these parts. This modification to the first
fit method is totally heuristic. It would be nice to try alterna-
tive techniques in a simulation model before application.

However, it is important to notice, that the allocation strategy
may be changed at any time, since it has no effect whatsoever on
the system>s ability to access already allocated pages. Thus, if
a better strategy is found, this will be applied (different stra-
tegies may even be used in different environments).

7_._2_._4_ _ _ _ _ _M_a_r_k_i_n_g_ _o_f_ _M_a_l_f_u_n_c_t_i_o_n_i_n_g_ _P_a_g_e_s_ 7.2.4

Pages, which develop persistent errors, should be marked in some
way to ensure future unaccessibility.

It is natural to use the Bit Map for this. Slices containing
malfunctioning pages are considered "allocated", though not to
any specific file.

This is, however, more easily said than done.

A malfunctioning page can only be detected during r_e_a_d_i_n_g_ of the
page (if a write-operation is specified to an ordinary disk-con-
troller, it will write the data unconditionally - no matter how
defect the page may be).
\f

This means that malfunctioning pages typically are detected
during reading of a file. In this situation we cannot just mark
the relevant slice as "allocated" in the Bit Map, since this is
the case already (it is allocated to the file we are reading)!
What to do?

P_r_o_p_o_s_a_l_: Whenever a file is d_e_l_e_t_e_d_, it will be check-read to
discover any malfunctioning pages. All pages, but the
malfunctioning ones, will be deallocated.

Simple to implement, but it is unacceptable in terms of time-con-
sumption. Consider deleting a file of 40.000 (say) pages --
patience will be required!

P_r_o_p_o_s_a_l_: A separate bit table -- the Malfunctions Bit Table
(MBT) -- is maintained. The table references all mal-
functioning slices. Whenever a malfunctioning page is
discovered the associated slice will be marked in MBT.
Whenever a slice is deallocated, it is checked in the
MBT that this slice is ok. If it is not ok, no bit will
be set in the ordinary Bit Table of free slices.

This method does not add notably to system overhead. Furthermore,
it has the additional advantage of maintaining a complete table
of the malfunctioning areas of the backing storage. Such a table
may prove useful in a number of situations. Thus, this last
method will be chosen.

7_._3_ _ _ _ _ _ _ _F_u_n_c_t_i_o_n_s_ _o_f_ _t_h_e_ _D_S_M_-_L_e_v_e_l_ 7.3

The set of functions of a single Device Strategy Module will look
something like:

get (buffer, page _addr, page _count)
put (buffer, page _addr, page _count)

mount (volume _description)
dismount

allocate (amount, chunk)
deallocate (chunk)

B_u_f_f_e_r_ and p_a_g_e_ _c_o_u_n_t_ are described in section 5.3. P_a_g_e_ _a_d_d_r_ is
the address -- within the given volume -- of the first page to be
transferred, or marked defect. V_o_l_u_m_e_ _d_e_s_c_r_i_p_t_i_o_n_ contains in-
formation concerning the characteristics of the volume to be
mounted; used for initialization of the module. A_m_o_u_n_t_ is the
number of slices to be allocated by the "first-fit" method. C_h_u_n_k_
indicates location and length of the chunk allocated or (to be)
allocated.

7_._4_ _ _ _ _ _ _ _C_o_n_c_l_u_d_i_n_g_ _R_e_m_a_r_k_s_ 7.4

The design of the DSM-level as described above, meets all of the
stated objectives of section 7.1.

One should notice that the DSM-level must be implemented with
care, since several demands to the flexibility and modularity of
the implementation have been stated.

F_ 8_._ _ _ _ _ _ _ _ _I_N_P_U_T_ _O_U_T_P_U_T_ _C_O_N_T_R_O_L_ _S_Y_S_T_E_M_ _(_I_O_C_S_)_ 8.

This level defines the actual software interface to the hardware
of external devices.

Furthermore it performs functions such as: error retrial,
exeption handling, etc.

Since the level is highly implementation and configuration de-
pendent, it will not be elaborated in detail here. In fact only
one single aspect will be treated: the size of pages (a page
being the minimal quantum of bytes accessed from an arbitrary
backing storage device, in one operation of the IOCS).

The objectives and discussion is accordingly restricted to this
single issue.

8_._1_ _ _ _ _ _ _ _O_b_j_e_c_t_i_v_e_s_ 8.1

Prime objectives:

- the fixed page size should balance the factors of buffer claim
(on core storage), and the transfer rate of data.

Secondary objectives:

- the page size should support all common formatting standards of
backing storage devices (allowing use of pre-formatted media).

8_._2_ _ _ _ _ _ _ _D_e_s_i_g_n_ 8.2

The effective transfer rate (in bps) is higher for large p_a_g_e_
s_i_z_e_s_ than for small (as the administration overhead is approxi-
mately constant). This fact points to large page sizes.

Core storage is a critical resource. Large page sizes consume a
lot of core for buffers. This points to small page sizes.
\f

Balancing these factors is hard because circumstances may be so
different from installation to installation (core storage is cer-
tainly a critical resource on a 8-bit microcomputer with a maxi-
mum address-space of 64 Kb, as well as high transfer rates may be
essential in a large multi-user system).

We have -- rather arbitrarily, but hopefully balancing the above
mentioned factors -- chosen a page size of 512 bytes.

This page size supports the most common sector size standards for
floppy disks. As "standard" for floppy disks (diskettes), the IBM
formats are generally prevailing. The clasical IBM standards are:
128, 256, and 512 bytes per sector (ref.: IBM not dated).

These may all be supported by the chosen page length of 512
bytes/sector, since more than one sector may constitute a page
(e.g. 4 sectors of 128 bytes may be read into a 512 byte page-
buffer).

IBM has introduced another format of 1024 bytes/sector lately.
This cannot -- without considerable difficulties -- be supported
by a 512 byte page size! However, the format does not -- as of
this time -- seem to be generally accepted, and predictions are,
that it will be dropped or at least limited in use. Since the
expences of a 1024 bytes pagesize format are considerable in
respect to core consumption of pagebuffers, we choose not to be
impeded by the small chances of a 1024 bytes/sector standard
being accepted.

8_._3_ _ _ _ _ _ _ _F_u_n_c_t_i_o_n_s_ _o_f_ _t_h_e_ _I_O_C_S_-_L_e_v_e_l_ 8.3

As stated above the actual interface to the IOCS may be rather
environment-dependent. A general outline for the functions may
look like this:

get (buffer, page _addr, page _count)
put (buffer, page _addr, page _count)

control (request)

B_u_f_f_e_r_, p_a_g_e_ _a_d_d_r_ and p_a_g_e_ _c_o_u_n_t_ are described in section 7.3.
R_e_q_u_e_s_t_ denotes the wanted control-function of the driver.

8_._4_ _ _ _ _ _ _ _C_o_n_c_l_u_d_i_n_g_ _R_e_m_a_r_k_s_ 8.4

The stated objectives of section 8.1 are met.

F_9_._ _ _ _ _ _ _ _ _A_C_C_E_S_S_ _M_E_T_H_O_D_S_ _(_A_M_)_ 9.

The Access Methods is the user>s interface to the file system.

The facilities of this interface is highly dependent on applica-
tions, environments, etc. It may comprise techniques such as:
fixed or variable length record accessing, single or multibuf-
fered accessing, random or sequential accessing, specialized
database record accessing, etc.

As stated in chapter 3, it is not considered part of this file
system design to define or discuss facilities of the AM-level.

It should be noted, however, that the file system has been de-
signed to support developments of different problem-oriented
access methods. If the AM-level had been incorporated into the
file system nucleus, the user would have been constrained by a
small set of limited access methods.

F_ 1_0_._ _ _ _ _ _ _ _L_O_G_I_C_A_L_ _F_I_L_E_ _S_Y_S_T_E_M_ _(_L_F_S_)_ 10.

The primary component of this module is the c_a_t_a_l_o_g_ _s_y_s_t_e_m_. This
is responsible for the mapping of (userdefined) symbolic n_a_m_e_s_ on
to the internal and unique file i_d_e_n_t_i_f_i_e_r_s_. Furthermore the mo-
dule offers authorization enforcement facilities.

In fact, the services and facilities of the LFS may be numerous.
Since one of our main goals for the file system nucleus is to
keep the size within limits, the actual LFS of this design will
be kept at a minimum, only offering basic functions. These func-
tions must be chosen with care, so that augmenting levels, imple-
menting more refined and potent naming systems, are well suppor-
ted.

The following subsections investigate the properties of naming,
and based on this analysis, decide on (1) the structure of cata-
logs, (2) a simple authorization enforcement scheme, and (3)
which basic functions the LFS should offer.

Search-techniques in catalogs are not discussed in this report.

1_0_._1_ _ _ _ _ _ _O_b_j_e_c_t_i_v_e_s_ 10.1

Prime objectives:

- the catalogs must be able to hold a very large amount of names
(preferably no fixed limits on the number of names in a cata-
log).

- flexible, but easy-to-understand rules for protection of other
user>s names.

- sharing of files between users must be allowed (if so specified
by the "owner" of the file).

- a set of access-rights must be associated with each name, so
that a simple authorization scheme can be enforced.
\f

- catalogs should be distributed on volumes (no single volume
should hold the names of files on every other volume, since
such a catalog could grow to enormous dimensions - and would be
a very vulnerable point in case of errors - "if the mastercata-
log goes, the whole system goes ...").

Secondary objectives:

- catalogs on different volumes of backing storage (for instance
disc-packs) should not be totally independent: inter-volume
references are a practical tool. The danger of such references
are that they may lead to repeated mounting of volumes just to
continue a path in a chain of references.

1_0_._2_ _ _ _ _ _ _D_e_s_i_g_n_ 10.2

The discussion of the Logical File System is divided into follow-
ing main items:

- a model for naming
- how should files be named?
- how should the catalog system be structured?
- how should a simple authorization enforcement scheme be
implemented?
- deleting of files

1_0_._2_._1_ _ _ _ _A_ _M_o_d_e_l_ _f_o_r_ _N_a_m_i_n_g_ 10.2.1

In this subsection we will consider filenames as such, i.e. the
string of characters representing a symbolic reference to the
data structure: a "file". We will not be bothered with reflecti-
ons on how to organize the underlying data which represents the
names.

1_0_._2_._1_._1_ _ _S_p_e_c_i_a_l_ _O_b_j_e_c_t_i_v_e_s_ _o_f_ _N_a_m_i_n_g_ 10.2.1.1

Following special objectives concerning the naming of files could
be drawn (in accordance with the general objectives of the Logi-
cal File System):

- File names are chosen and used by humans: naming rules
must allow names which are easily recognized and remembe-
red by humans.

- In a file system a file is considered an o_b_j_e_c_t_ of the
class of files. It is natural to divide the class into
groups of objects in different ways. A mechanism of divi-
ding often used, is the hierarchical structuring (since it
mirrors the common structure of labour-organization --
amongst human workers as well as tasks/projects). However,
other structures of division may be used too.

- The set of accessible filenames should not be static. Dif-
ferent users may see different subsets (maybe even totally
independent subsets).

- Users seeing separate subsets of filenames must be able to
name distinct files with the one and same name. User A may
see a file "PIP" which has nothing to do with the file
"PIP" of user B.

- Different users with shared access to a single file must
be able to name this file individually, i.e. a file may
have several names.

1_0_._2_._1_._2_ _ _N_a_m_e_s_:_ _A_n_ _A_b_s_t_r_a_c_t_ _M_o_d_e_l_ 10.2.1.2

Now, let us try to express the notions of "naming" abstractly.
First of all we define the words: "context", "partial filename",
and "complete filename".
\f

The objective of allowing one name to represent several separate
files means that the interpretation of a filename depends on the
c_o_n_t_e_x_t_ in which it is used.

This leads to the notion of "c_o_m_p_l_e_t_e_" and "p_a_r_t_i_a_l_" filenames. A
complete filename is -- by definition -- unambiguous. Partial
filenames may very well be ambiguous, but the context in which
they are used will determine their corresponding complete file-
names -- thereby giving them exact and unambiguous interpretati-
ons.

The construction of complete filenames may be a facility of the
AM-level. It will automatically append some sort of context
identification (prefix) to any user specified filename. This
implies that a filename always is unambiguous (complete) in the
interface between AM and LFS.

The context of a user is defined and perhaps changed by some sort
of supervisory system (possibly embedded in the AM-level).

The properties of a naming system could be represented as a
directed graph G. The context of any user is represented by a
unique vertex of the graph G, called his "context-vertex". A user
is said to have "access to" a vertex V if and only if there is an
(not necessarily unique) oriented path (eDd1Uu, eDd2Uu, ---, eDdnUu)
from V to his context-vertex CV (eDdiUu is the name of edge i in the
digraph). The tuple (eDd1Uu, eDd2Uu, ---, eDdnUu) is called the "partial name"
of vertex V as seen from CV. Vertices with in-degree equal to
zero are called "files", all others are called "catalogs". This
implies that a user with context-vertex of type "file" has access
to n_o_ other vertices, whereas a user with context-vertex of type
catalog may have access to a number of vertices -- in fact to the
sub-digraph of which the context-vertex is a root.
\f

(tegning)

Figure 10.1: Example of a digraph structure.

E_x_a_m_p_l_e_: User A with context-vertex C1 has a_c_c_e_s_s_ _t_o_ C1, F1, C3,
F2, F3, and F4.
User B with context-vertex C2 has access to C2, F4, F5,
&_ and F6.

M_M_m_ P_a_r_t_i_a_l_ _n_a_m_e_s_ of F1 as seen from C1 could be (e ), (e ,
P_P_p_ 2 1
M_M_m_ e ), (e , e , e ), etc.
P_P_p_ 2 1 1 2

As F4 are accessible from both C1, and C2, partial names
M_M_m_ of F4 could be e.g. (e ), (e ), (e , e ), etc.
P_P_p_ 3 4 5 4

To interpret this abstract model in terms of the well-known noti-
ons of filenames (as a string of characters) and context (as the
state in which a user finds the file system, defining the subset
of filenames accessible), we define two f_u_n_c_t_i_o_n_s_ on an arbitrary
digraph D.

First, define following sets:

E = e e is an edge of D
M_M_m_ n
P = (e ,e ,...,e ) E (e ,e ,...e ) is an oriented path in D
P_P_p_ 1 2 n 1 1 n
V = v v is a vertex in D
C = c c defines a "context"
S = s s is a string of characters
\f

Then define the two functions:

NAME : P S
DEFINECONTEXT: C V

The situation of this point is:

Associated with every user of the file system is a context c
(this context may be defined in various ways: by a user-password,
by explicit command of the user, etc.). If we interpret the ver-
tices of file-type as file i_d_e_n_t_i_f_i_e_r_s_ (i.e. for every file in
the file system there exists one unique file-type vertex), we may
consequently interpret the oriented paths finishing in the vertex
defined by the value of D_E_F_I_N_E_C_O_N_T_E_X_T_(c) (and originating in a
file-type vertex), as representations of the accessible file-
names. The function N_A_M_E_ maps these paths into ordinary string
filenames.

1_0_._2_._2_ _ _ _ _H_o_w_ _S_h_o_u_l_d_ _F_i_l_e_s_ _b_e_ _N_a_m_e_d_?_ 10.2.2

Now the problem of choosing a reasonable naming method corres-
ponds to solving the following sub-problems.

- definition of restrictions on the digraph structure
- definition of the function NAME
- definition of the function DEFINECONTEXT.

1_0_._2_._2_._1_ _ _R_e_s_t_r_i_c_t_i_o_n_s_ _o_n_ _t_h_e_ _D_i_g_r_a_p_h_ _S_t_r_u_c_t_u_r_e_ 10.2.2.1

P_r_o_p_o_s_a_l_: Only one vertex of catalog type exists. The rest of the
vertices in the digraph are all of file-type. The di-
graph forms an oriented tree.
(RC system 2, PDP SOLO)
\f

Figure 10.2: Global catalog.
&_

This is the simplest imaginable naming system. Only a single non-
trivial context-vertex exists, namely the catalog-type vertex.
All users have shared access (at least at the level of naming) to
all files.

Of course this does not meet the objectives of the naming system.

P_r_o_p_o_s_a_l_: The digraph forms an oriented tree with three levels.
The root may be called the "Master File Directory"
(MFD); rest of the catalog-type vertices may be called
"User File Directories" or (UFD) "subcatalogs".
(TITAN, RC3600).

Figure 10.3: Two-level catalog structure.

Different users may have different context-vertices (correspond-
ing to the UFD>s) in this proposal. Naming is still very simple,
but it does not allow multiple naming of a file.

P_r_o_p_o_s_a_l_: The digraph forms an oriented tree with an arbitrary
number of levels.
(RC system 3)

Figure 10.4: Tree-structure.

As an example of this proposal we may take the RC system 3 naming
system (used on RC4000 and RC8000):

Slightly s_i_m_p_l_i_f_i_e_d_ we can describe the rules of accessibility
and protection as follows:

Every filename in the file system is connected with two
integers, forming an interval (called the baseinterval of the
filename).

Every process accessing the file system is connected with a
base-interval defining the accessible and protected filenames.

A filename can be used unrestrictedly from a given process, if
(and only if) the base-interval of the filename is completely
covered by the base-interval of the process.
\f

Furthermore a filename can be accessed (for reading) if the
base-interval of the filename completely covers the base-inter-
val of the process.

These two access rules obviously corresponds to t_w_o_ independent
oriented trees of the sort proposed: an "update-tree" and a
"read-only-tree".

The notable drawback of the proposal is that it does not allow
multiple naming of a file. This was a primary objective of the
LFS.

The problem may be overcome by slackening the restrictions, al-
lowing the file-type vertices to have out-degree greater than one
(this means that the digraph is not restricted to the oriented
t_r_e_e_ organization).

P_r_o_p_o_s_a_l_: The digraph is r_o_o_t_e_d_, but otherwise unrestricted.
(OS6)

This is the most liberal proposal possible. It offers completely
optional choise of naming system for any augmenting level.

The proposal will be adapted in the final design.

1_0_._2_._2_._2_ _ _T_h_e_ _F_u_n_c_t_i_o_n_ _N_A_M_E_ 10.2.2.2

We will make two restrictions on the function NAME:

M_M_m_ - two a_l_m_o_s_t_ identical paths (e , e , e , x) and (e , e , e , y)
P_P_p_ 1 2 3 1 2 3
should be named a_l_m_o_s_t_ identically (to be easily recognized
by the humans).

This could be fulfilled if:
NAME (eDd1Uu, eDd2Uu) = NAME(eDd1Uu) d NAME(eDd2Uu)
where d is some sort of character delimiter.
Reasonable and often used choises for d are ".", "/", or
":".
\f

- to ensure unambiguiety in connection with the above restric-
tion, we must define:
NAME(eDd1Uu) 0=/5 NAME (eDd2Uu)
if eDd1Uu and eDd2Uu both are edges initiating in the same vertex
(no two names in a catalog may be identical).

1_0_._2_._2_._3_ _ _T_h_e_ _F_u_n_c_t_i_o_n_ _D_E_F_I_N_E_C_O_N_T_E_X_T_ 10.2.2.3

The function DEFINECONTEXT maps the abstract notion of a "user-
context" into an actual vertex of the naming-digraph.

The user-context may be defined by means of passwords, explicit
command, jobnames, etc. This is definitely not a problem which
the file system nucleus should deal with. It will be left to aug-
menting levels.

1_0_._2_._2_._4_ _ _S_u_m_m_a_r_y_ _o_f_ _N_a_m_i_n_g_ _R_e_s_t_r_i_c_t_i_o_n_s_ 10.2.2.4

The naming of files may be represented by a d_i_g_r_a_p_h_. The vertices
of the digraph are of f_i_l_e_ type if their in-degree equals zero,
otherwise these are of c_a_t_a_l_o_g_ type. The digraph must be r_o_o_t_e_d_
and a_c_y_c_l_i_c_. Every edge of the digraph has a name (string of
ASCII-characters). The names of the edges ending in a catalog-
type vertex are the names c_o_n_t_a_i_n_e_d_ in this catalog. No two
identical names may be contained in the same catalog. The c_o_m_-
p_l_e_t_e_ name of a vertex (file or catalog), is the concatenated
string of edge-names (delimited by a special character such as
"."), so that the edges form a path from the vertex to the root.

1_0_._2_._3_ _ _ _ _S_t_r_u_c_t_u_r_e_ _o_f_ _t_h_e_ _C_a_t_a_l_o_g_s_ 10.2.3

This subsection deals with the actual implementation of the di-
graph structure described in subsection 10.2.2.1.

There are two types of vertices: files and catalogs. The imple-
mentation of file-type vertices has been discussed in great\f

detail in previous chapters. So it is the implementation of
catalog-type vertices that remains.

The problem is: how to represent catalog-type vertices by actual
data-structures?

P_r_o_p_o_s_a_l_: All catalog-information is collected in o_n_e_ file, a
"Master-catalog".

This proposal is obviously relative simple to implement. The file
should be organized allowing random access. Searching may
therefore be fast, since the organization scheme could be made to
fit the usage precisely.

The notable disadvantage of the proposal is, that the catalog
information is not distributed on the individual volumes: The
volume holding the Master-catalog must always be mounted! If
there is a lot of files on the backing storage, the catalog could
grow to enormous dimensions. This is naturally in conflict with
the objectives of the LFS.

P_r_o_p_o_s_a_l_: There is a single "volume-directory" file on each
volume, containing the collection of all the catalogs
of the volume.
(RC System 3)

This proposal fulfils the stated objectives. Still, the volume-
catalog may grow, containing numerous entries, if the volume is
big. It may be difficult to dimension the catalog optimally,
because it is practically impossible to predict the actual number
and distibution of entries. This may cause some degradation of
performance (overflow problems etc.).

P_r_o_p_o_s_a_l_: Every catalog-type vertex corresponds to an individual
catalog file.
(OS6)
\f

This is probably the "cleanest" design. Catalog files may be
dimensioned and organized individually, according to the typicals
of the files they reference.

The danger of the method is that "long" filenames containing
several concatenated simple names, may be rather time-consuming
to interpret, since every individual catalog in the "path" of the
filename must be (1) opened, (2) searched, and possibly (3)
closed.

However, this problem may be more or less remedied by opening
frequently used catalogs permanently.

The last proposal will be chosen for the final design.

1_0_._2_._4_ _ _ _ _A_u_t_h_o_r_i_z_a_t_i_o_n_ _E_n_f_o_r_c_e_m_e_n_t_ 10.2.4

Authorization enforcement is of great importance both in a
multi-user environment and in a single-user environment, since it
not only protects against maliciousness -- but also against
accidental errors.

Therefore it is a necessary and essential part of the LFS.

On the other hand it should certainly not be sophisticated,
adding too much complexity to the file system nucleus:
Sophistication should be left to augmenting systems.

The naming system described in the previous subsections partly
controls the users access to files (a user can only access files
which he can name). But this is not sufficient because it is an
"all-or-nothing" sort of enforcement. It is necessary to control
access to files in a more minute way.
\f

A reasonable set of possible authorizations for a user of a file
(catalog or data-file) are:

authorization to:
- RETRIEVE data
- UPDATE data
- ADJUST length of file
- DELETE file.

The main problem of authorization enforcement is: how are author-
izations granted to separate users of the same file?

It would involve substantial additional administration overhead
to the file system, if it were to maintain separate descriptions
of the individual users (so-called user-profiles), defining the
authority of each one. Such natural extensions will be left to
augmenting systems.

Instead we will use the naming system exclusively, for defining
"users". If two users are to have different authority relative to
a file, they must name the file individually, thus allowing dif-
ferent authorization attributes to be associated with each names.

In keeping with this convention, it is natural to define the
o_w_n_e_r_-name of a file, as the name under which the file were ori-
ginally created. If other names are linked to this name (i.e. re-
ferencing the same file) they must be g_r_a_n_t_e_d_ authority by the
owner-name.

A very simple -- but effective -- scheme for granting of author-
ities is: whenever a name A is linked to a name B, B will be
granted authorities which are a subset of A>s. The authorities
of a name may be changed if and only if the (not mentioned
earlier) authority AUTHORIZE is contained within the authoriza-
tion attributes of the name.
\f

E_x_a_m_p_l_e_:

A file is created under the name of A with following
authorities:

A: RETRIEVE, UPDATE, ADJUST, DELETE, AUTHORIZE.

That is: all authorities.

Since the user of A wants to protect his file against deleting,
he defines that only the authorities RETRIEVE, UPDATE, and
ADJUST may be granted to linked names:

A: R_E_T_R_I_E_V_E_ , U_P_D_A_T_E_, A_D_J_U_S_T_, DELETE, AUTHORIZE

If a name B is linked to A, it inherits following authorities:

B: RETRIEVE, UPDATE, ADJUST

The user of B -- in his place -- wants to protect the file
against any updating from subsequently linked names, and defi-
nes only RETRIEVE as inheritable:

B: R_E_T_R_I_E_V_E_, UPDATE, ADJUST

A name C linked to B, inherits:

C: RETRIEVE

In case of compound filenames (i.e. concatenated names, e.g.
"CAT1.CAT2.PIP") the set of authorities in effect is the
i_n_t_e_r_s_e_c_t_i_o_n_ of the sets for each of the individual simple names.

Thus, all the files of a catalog may be protected against upda-
ting, if the UPDATE authorization is absent from the authority of
the catalog name. In fact it will not even be possible to insert
new, nor delete existing names in the catalog.
\f

E_x_a_m_p_l_e_:

Consider a multi-user system in which every user has his own
private subtree catalog structure. The authorizations associated
with the root of this subtree are presumably unrestricted. To
prevent interference between users, no user subtree may contain
backward references to catalog levels beyond that of the subtree
root.

On the other hand, a number of files must be sharable between the
users: utility programs, system programs, compilers, etc. This
may easily be accomplished by the scheme outlined in figure 10.5:

Figure 10.5: Example of multi-user directory structure.
Authorizations are showed at the edges.

It will not be possible for any user (who has not access to the
system root) to update any file in the subtree of utilities.
\f

This scheme offers basic facilities for propagating authorities
in the naming system. However, it lacks the ability to protect a
user against own accidental misuse (d_e_f_e_n_s_i_v_e_ protection). If a
user temporarily wants to write-protect (say) a file, he can only
do so by removing some of his own authorities, possibly never to
regain them. This is unacceptable, so we define another set of
attributes, p_r_o_t_e_c_t_i_o_n_s_, exactly equal to the -- not to mistake
for -- authorizations (viz. RETRIEVE, UPDATE, ADJUST, DELETE,
AUTHORIZE).

Protections may be altered at any time and are local to each
name.

1_0_._2_._5_ _ _ _ _D_e_l_e_t_i_n_g_ _o_f_ _F_i_l_e_s_ 10.2.5

Deleting of files is a non-trivial problem in the chosen catalog-
structure. The point is: A given file may be referenced by sev-
eral names. If the file is requested to be deleted through one of
the names, what should happen with the other names, referencing
the same file.

P_r_o_p_o_s_a_l_: The entry in question in the File Location Table is
marked "deleted", and the actual file is deleted ef-
fectively. Later references to the file will be stopped
(and notified) by the deleted FLT-entry.

The proposal solves the problem, but unfortunately the File Loca-
tion Table will grow indefinitely, since no entry may ever be ef-
fectively deleted and re-used.

P_r_o_p_o_s_a_l_: A counter is inserted into every FLT-entry. The counter
gives the number of undeleted names pointing at the en-
try. When the counter is zero, the entry may be re-used
for another file.

This proposal seems allright at first look, but it does have a
well-known, classical drawback (Knuth 1969).
\f

Deleting of a name is the same as deleting (at least) one edge in
the corresponding digraph. If -- after deletion -- one or more
vertices become isolated from the root of the original digraph,
they are unaccessible, and should therefore be effectively dele-
ted. Counting of references to a FLT-entry corresponds to count-
ing of the o_u_t_-degree of the referenced file -- in terms of the
digraph model of subsection 10.2.1.2. When the out-degree is
zero, the vertex is unaccessible, and must be deleted together
with all the edges that end in the vertex. One should think that
by recursive application of the method, the unaccessible parts of
the digraph will be deleted. Unfortunately that is not so in all
cases.

Consider the following example:

Figure 10.6: Example of cyclic digraph.

where C1 is the root.

If the edge eDd1Uu is deleted C2 and C3 will become unaccessible,
and should consequently be deleted. However, the above proposed
"garbage collection" algorithm cannot do the job: both C2 and C3
have outdegree equal to one.

The reason for this trouble is, that the algorithm does not work,
if the "garbage" contains c_y_c_l_i_c_ references!

It is extremely difficult to prohibit cyclic references without
at the same time restricting the digraph structure unnecessarily
(the digraph could for instance be restricted to a t_r_e_e_, ensuring
\f

acyclicity), since it is difficult to check a structure for
cyclic references.

The problem may be solved by either (1) restrictions on the
digraph structure, (2) check of acyclicity on file-creation time,
(3) check for cyclic references at file-deletion time, or (4) use
of an individual garbage collection utility program checking the
whole digraph structure.

A_n_ _e_x_a_m_p_l_e_ of a possible method for acyclicity enforcement:

A status variable called LEVEL is retained in every file of
type catalog. This variable denotes the level (i.e. the
l_o_n_g_e_s_t_ path of references from the root).

LEVEL for
MAINCAT: 0
CAT1: 1
CAT2: 2
CAT3: 3

Figure 10.7: Example

The LEVEL variable is easily computed when a catalog is
created (add one to the level of the mother-catalog).

When a new reference between two existing catalogs is
inserted (an entry "CATB" (say) -- referencing the catalog B
-- is inserted in catalog A (say)), it should be checked
that the LEVEL of the referenced catalog (catalog B) is
greater than (and not equal to) the LEVEL of the "mother"-
catalog (catalog A). If this condition is met no cyclic re-
ferences can be introduced.
\f

This example suffers from adding unnecessary restrictions to the
acyclicity restriction, too. Admittedly, every acyclic digraph
may be constructed under the restrictions of the example, but it
is very critical how the creations of inter-catalog references
are sequenced. Consider Fig. 10.7: the reference eDd3Uu m_u_s_t_ be
created before eDd4Uu.

Instead of implementing either a simple, but unnecessary restric-
tive, or a very complex acyclicity enforcement system, it is the
decision of the present design to leave such to augmenting levels
of software. That is, no acyclicity enforcement scheme is
implemented in the design.

This decision seems reasonable, remembering that the possible
amount of garbage is restricted to a number of File Location
Table entries.

1_0_._3_ _ _ _ _ _ _B_a_s_i_c_ _F_u_n_c_t_i_o_n_s_ _o_f_ _t_h_e_ _L_F_S_ 10.3

Since the LFS is augmenting the BFS-level in the hierarchical
structure of the file system, the LFS-functions handling files
resembles the corresponding functions of the BFS (cf. section
5.3).

The difference is that the file identifier supplied in the func-
tions of the BFS should be substituted by a symbolic filename in
the functions of the LFS.

The syntactical rules for naming of files -- as defined by the
restrictions on the function NAME (cf. section 10.2 -- could be
expressed like this in Backus-Naur notation:

<name'::= <catalogname' <d' <simple name'
<catalogname'::= <catalogname' <d' <simple name'
<simple name'::= "algol-like identifier"
<d'::= >.> >:> >/>
\f

However, it is not necessarily fortunate to express filenames as
a string of concatenated catalog- and file-names, in the a_r_g_u_-
m_e_n_t_s_ of LFS-functions.

The recursive definition above suggests a way to express a
filename in the functions of the LFS, namely:
(catalog _identifier, simple _name). Instead of using the complete
filename (and consequently interpreting it) every time a function
is called (e.g. OPEN ("cat1.cat2.cat3.pip", ...)), one may find
the file _identifier (the unique internal name, cf. subsection
5.2.1) for the "catalog-part" of the name string (that is
"cat1.cat2.cat3" in the above example), and just stating the
simple name ("pip") after this (OPEN (catid, "pip", ...)).

This way to name files on the LFS-level offers potent facilities
to augmenting levels. An augmenting naming system may for instan-
ce maintain a table of catalog _identifiers with an entry for each
user. Now the user only has to specify a simple name to reference
a file -- the naming system automatically supplies the
catalog _identifier of the user. The catalog _identifiers may be
initialized by means of user-supplied passwords, thereby imple-
menting a simple authorization enforcement scheme on u_s_e_r_s_.

A reasonable set of functions on the LFS-level is:

mount(volumename, device, cat _id)
dismount(device)

connect(cat _id, name, new _cat _id)
disconnect(cat _id)

lookup(cat _id, name, attributes, descriptor)

open(cat _id, name, mode, file _id)
close(file _id)

get(file _id, buffer, first _page, count)
put(file _id, buffer, first _page, count)
\f

create(cat _id, name, init _data)
delete(cat _id, name)

linkname(cat _id _1, newname, cat _id _2, oldname)
rename(cat _id, oldname, newname)

extend(cat _id, name, amount)
truncate (cat _id, name, amount)

set _tail(cat _id, name, tail)

set _authority(cat _id, name, authorizations)
set _inheritance(cat _id, name, inheritable _authorizations)
set _protection(cat _id, name, protections)

Please refer to chapter 6 for description of the functions.

1_0_._4_ _ _ _ _ _ _C_o_n_c_l_u_d_i_n_g_ _R_e_m_a_r_k_s_ 10.4

None of the objectives of section 10.1 are neglected.

The chosen catalog structure is compatible with the file system
of UNIX (Richie et al. 1974), in the sense that the interface of
UNIX may be simulated, provided that a proper set of restrictions
and supporting facilities are built on, in a superimposing level
to the LFS. Owing to the increasing popularity of UNIX and
related systems (Hanson 1980) this seems worthwhile to note if
considering standardization on the level of access interface.
\f

F_1_1_._ _ _ _ _ _ _ _R_E_M_A_R_K_S_ _O_N_ _I_M_P_L_E_M_E_N_T_A_T_I_O_N_ 11.

As stated in the introduction (section 1.1) the present report is
part of a development-project at A/S Regnecentralen af 1979. The
outcome of this project must be an actual implementation of the
designed file system.

However, it is not within the scope of this report to describe
implementation in detail, but a general outline of the ideas and
structure of a possible implementation will be given in this
chapter.

The intended implementation language is PASCAL80 (Staunstrup
1980).

1_1_._1_ _ _ _ _ _ _S_t_r_u_c_t_u_r_i_n_g_ _t_h_e_ _F_i_l_e_ _S_y_s_t_e_m_ 11.1

The individual levels of the design presented in chapter 4
are conseived as p_r_o_c_e_s_s_e_s_ (or incarnations of processes).

Figure 11.1: Processes of the file system, nested.
\f

Figure 11.1 shows the processes of the system. The processes are
-- in keeping with ordinary PASCAL80 principles -- conceived as
n_e_s_t_e_d_ processes.

However, it may be opportune -- in a not-so-distant future -- to
distribute the file system over a number of loosely connected
computers, each with its own private memory.

Fortunately such possibilities are inherent in the structure of
the implementation language PASCAL80, because all modularization
of programs is done by dividing the code into separate processes
(processes contain completely closed address spaces; the only
information that can be imported or exported from a process must
go through process p_a_r_a_m_e_t_e_r_s_ or through message s_e_m_a_p_h_o_r_e_s_).

Consequently the organization of fig. 11.1 is preferably restruc-
tured -- from the n_e_s_t_e_d_ organization to a "h_o_r_i_z_o_n_t_a_l_" organi-
zation of processes.

The horizontal organization of fig. 11.2 allows distribution of
the individual processes over a number of connected computers.

Figure 11.2: Processes of the file system, "horizontal" design.

1_1_._2_ _ _ _ _ _ _T_r_a_n_s_p_a_r_a_n_c_y_ _o_f_ _H_i_e_r_a_r_c_h_i_c_a_l_ _L_e_v_e_l_s_ 11.2

The hierarchical model of chapter 2 proved very convenient for
the design of the file system.
\f

However, in the actual implementation the structure need not be
too rigoristic.

It is not at all efficient if a procedural call to one level
simply results in an exactly identical call to the next level
(i.e. the level is transparent with respect to the requested
operation).

An example of such unwanted overhead is the call of "get". As
illustrated in section 2.2 this does not involve LFS or BFS (be-
cause open performs the operations of these levels). It would
indeed be ineffective if "get" were to be called through both LFS
and BFS.

The actual entry-points of the levels are illustrated in fig.
11.3.
\f

Figure 11.3: Interconnections in
the file system.

F_ 1_2_._ _ _ _ _ _ _ _A_ _P_R_E_L_I_M_I_N_A_R_Y_ _R_E_F_E_R_E_N_C_E_ _M_A_N_U_A_L_ 12.

1_2_._1_ _ _ _ _ _ _I_n_t_r_o_d_u_c_t_i_o_n_ 12.1

The present text describes a PASCAL80-implemented File System.

The File System is intended to administer the backing storage of
a computer system. This backing storage may consist of several --
possibly different -- backing storage devices, e.g. hard disks
and floppy disks.

It is the scope of this manual to give the reader an introductory
overview of the components and functions of the file system. Con-
sequently, descriptions of functions etc. are sketchy and by no
means sufficient.

The reader should refer to the future "System Programmes Referen-
ce Manual" for detailed description of the File System.

1_2_._2_ _ _ _ _ _ _C_o_m_p_o_n_e_n_t_s_ _o_f_ _t_h_e_ _B_a_c_k_i_n_g_ _S_t_o_r_a_g_e_ 12.2

1_2_._2_._1_ _ _ _ _V_o_l_u_m_e_s_ 12.2.1

The backing storage of a computer system consists of one or more
v_o_l_u_m_e_s_. Each volume resides on a d_e_v_i_c_e_. A given volume is not
necessarily tied to one particular device: volumes may be ex-
changed or replaced.

Each volume has a v_o_l_u_m_e_n_a_m_e_, given to it by the initialization
program formatting the volume. It is the responsibility of the
user to ensure that volumenames are unique.

A volume may comprise a maximum of 4 G (billion) pages.
\f

Figure 12.1: Example.
Volume 2, 3, and 4 are mounted on the devices 2, 3,
and 1 respectively. The volumes 1 and 5 are not
mounted (dismounted).

1_2_._2_._2_ _ _ _ _P_a_g_e_s_ 12.2.2

A volume consists of a number of pages. One page contains 512
bytes (octets). Pages may be a_l_l_o_c_a_t_e_d_ or f_r_e_e_ (i.e. not-in-use).

Free pages are not allocated individually: the smallest unit of
allocation is a s_l_i_c_e_ (consisting of a number of pages). The
slice-size may vary from volume to volume.

The free slices of a volume are administered by the B_i_t_ _T_a_b_l_e_, in
which every slice of the volume is represented by one bit (a
slice is free if the corresponding bit is set).

1_2_._2_._3_ _ _ _ _F_i_l_e_s_ 12.2.3

Each volume contains a number of f_i_l_e_s_.

A file is the basic facility for permanent storage of data.
\f

A file consists of a (logical) sequence of p_a_g_e_s_. The individual
pages (or rather: slices) of a file need not necessarily be
placed contiguously on the volume; they may be scattered in
c_h_u_n_k_s_ of slices.

Figure 12.2: Example

The administrative data concerning the whereabouts of the chunks
of a file are placed on the first page of the file. A file may
comprise a maximum of 25 chunks of arbitrary size. A file may
hold all the pages of a volume.

Figure 12.3: File organization.
\f

The first page moreover contains the d_e_s_c_r_i_p_t_o_r_ of the file. The
descriptor describes the status of the file in terms of length,
reservation locks, etc.

The first page of a file is called the d_e_s_c_r_i_p_t_o_r_ _p_a_g_e_ of the
file.

The locations of files are administered by the F_i_l_e_ _L_o_c_a_t_i_o_n_
T_a_b_l_e_ (FLT). This table contains the address of every file (i.e.
the address of its first page). No two table entries point to the
same file; that is: the table index of a given file is unambi-
guous within the volume. Therefore the FLT-index of a file to-
gether with the corresponding volumename is used as a unique in-
ternal name for that file.

The pair of data (volumename, FLT-index) is called a f_i_l_e_ _i_d_e_n_t_-
i_f_i_e_r_.

1_2_._2_._4_ _ _ _ _C_a_t_a_l_o_g_s_ 12.2.4

Users must be able to associate symbolic n_a_m_e_s_ with files.

Symbolic names -- together with the corresponding file
identifiers -- are stored in special files: c_a_t_a_l_o_g_s_.

There may be several catalogs on a single volume.

A symbolic name may reference a catalog, since symbolic names
represent file identifiers and catalogs are files themselves.
This facilitates hierarchical (or rather graph-represented)
naming systems.

A single file identifier may be designated by several different
names. A name may designate a file identifier on another volume.

A special catalog exists on each volume: the Rootcatalog. This
catalog is the "root" of the hierarchy of catalogs on that volume
(i.e. every catalog may be reached, starting at the Rootcatalog).\f

Whenever a new volume is mounted, the File System delivers a re-
ference to the corresponding Rootcatalog.

Figure 12.4: Catalog organization.

1_2_._2_._5_ _ _ _ _V_o_l_u_m_e_d_e_s_c_r_i_p_t_i_o_n_s_ 12.2.5

Every volume contains a special system file: the V_o_l_u_m_e_d_e_s_c_r_i_p_-
t_i_o_n_. This file holds data concerning the status of the volume
(slice size, volumename) etc.

Furthermore it contains the Bit Table and the File Location
Table.

To be able to interpret the datastructures of an arbitrary vol-
ume, the following convention regarding the Volumedescription
file is made: the descriptor page of the Volumedescription file
is always placed as the first page of the volume.
\f

This means that the Volumedescription file always can be read
without knowledge of the exact locations of File Location Table,
Rootcatalog, etc. It is the responsibility of the data in the
Volume-description file to point such datastructures out.

The volumedescription file contains an extra bit table -- besides
the Bit Table described in subsection 12.2.2. This bit table is
called the "Malfunctions Bit Table" (MBT), and it specifies all
slices in which persistent errors have been detected.

1_2_._3_ _ _ _ _ _ _F_u_n_c_t_i_o_n_s_ _o_f_ _t_h_e_ _F_i_l_e_ _S_y_s_t_e_m_ 12.3

Below is listed the functions of the File System nucleus. The
notation is casual but hopefully self-explanatory. Each of the
calls to the file system may result in an error condition. The
return from such a situation is for simplicity not represented.

1_2_._3_._1_ _ _ _ _V_o_l_u_m_e_ _H_a_n_d_l_i_n_g_ 12.3.1

Volumes may be mounted or dismounted:

MOUNT (VOLUMENAME, DEVICE, CAT-ID)
DISMOUNT (DEVICE)

DEVICE specifies the device on which the volume should be
mounted/dismounted. When a volume is mounted its name will be
checked against VOLUMENAME. The Rootcatalog of the mounted volume
will be referenced by CAT-ID upon return of MOUNT.

1_2_._3_._2_ _ _ _ _C_a_t_a_l_o_g_ _H_a_n_d_l_i_n_g_ 12.3.2

A catalog must be c_o_n_n_e_c_t_e_d_ ("opened") before it can be proces-
sed. When a user connects a catalog, he will be supplied with a
shorthand reference to this catalog. In fact this is what hap-
pens in the function MOUNT of the previous subsection: when a
volume is MOUNTed the Rootcatalog will automatically be connect-
ed, and its shorthand-reference delivered in CAT-ID.
\f

A user only knows catalogs through symbolic names (which in their
place are found in another catalog). Whenever a (simple) name is
specified, it must be accompanied by a reference to the catalog
in which to find the name.

CONNECT(CAT-ID, NAME, NEW-CAT-ID)

connects a new catalog. The user describes the catalog by a sym-
bolic name in NAME and a catalog reference CAT-ID in which to
find the name NAME. Upon return the catalog will be opened and
referenced by NEW-CAT-ID.

When the catalog has been processed it may be "closed" by means
of:

DISCONNECT(CAT-ID)

A specific name NAME contained in the catalog referenced by CAT-
ID may be found by means of:

LOOKUP(CAT-ID, NAME, ATTRIBUTES, DESCRIPTOR)

This function returns the attributes of the found name, and the
descriptor of the file

1_2_._3_._3_ _ _ _ _F_i_l_e_ _H_a_n_d_l_i_n_g_ 12.3.3

When a file is to be processed it must be opened. After proces-
sing it may be closed (to release tied-up resources).

OPEN(CAT-ID, NAME, MODE, FILE-ID)
CLOSE(FILE-ID)

MODE denotes the wanted use of the file. It may be either RETRIE-
VE, UPDATE, or both. Furthermore it must be specified whether
PRIVATE or SHARED access is wanted. The file is designated by
(CAT-ID, NAME). A shorthand reference to the file is returned in
FILE-ID.
\f

The contents of the file referenced by FILE-ID may be processed
by:

GET(FILE-ID, BUFFER, FIRST-PAGE, COUNT)
PUT(FILE-ID, BUFFER, FIRST-PAGE, COUNT)

which transfer a number of pages. FIRST-PAGE denotes the first
page to be transferred to or from the file, and COUNT the number
of subsequent pages. Pages are put or taken from BUFFER.

New files may be created:

CREATE(CAT-ID, NAME, INIT-DATA)

A file with the simple name NAME -- which is inserted into the
catalog referenced by CAT-ID -- is created with initialization
data (for attributes and descriptor) as in INIT-DATA. The file
may be a catalog.

A file is deleted by

DELETE(CAT-ID, NAME)

When DELETE is called, the designated file is marked for
deletion. The file is not actually deleted, however, until all
connections to the file has been CLOSEd.

A catalog-file cannot be deleted, unless it is empty.

All other names referencing the same file will -- after deletion
of the file -- be referencing an e_m_p_t_y_ entry in the File Location
Table.

A new name may be linked to an already existing name by means of:

LINKNAME(CAT-ID-1, NEWNAME, CAT-ID-2, OLDNAME)
\f

A new simple NEWNAME is inserted into the catalog designated by
CAT-ID-1. The name will reference the same file as OLDNAME of the
catalog designated by CAT-ID-2.

A name may be changed (within a catalog) by means of:

RENAME(CAT-ID, OLDNAME, NEWNAME)

Finally a file may be extended or truncated:

EXTEND(CAT-ID, NAME, AMOUNT)
TRUNCATE(CAT-ID, NAME, AMOUNT)

AMOUNT is the number of s_l_i_c_e_s_ to allocate/deallocate.

Finally -- as a service function to augmenting file system levels
-- following function allows arbitrary data to be written into
the descriptor of a file:

SET-TAIL(CAT-ID, NAME, TAIL)

where TAIL contains the data to be inserted into the descriptor.
The actual format and length of TAIL is implementation dependent.

1_2_._3_._4_ _ _ _ _A_u_t_h_o_r_i_z_a_t_i_o_n_ _E_n_f_o_r_c_e_m_e_n_t_ 12.3.4

The f_i_r_s_t_ _l_e_v_e_l_ of authorization enforcement is the naming sys-
tem: users can only access the files they can name (if a user
only knows one catalog reference, he can only access the subtree
(or rather: subgraph) of files of which this catalog is a root).

The s_e_c_o_n_d_ _l_e_v_e_l_ of authorization enforcement concerns the
individual symbolic names of files: associated with each simple\f

name is a set of authorization-attributes. The total set
comprises:

Authorization to:
- RETRIEVE (data)
- UPDATE (data)
- ADJUST (length of file)
- DELETE (file)
- AUTHORIZE (i.e. redefine the set of authorizations)

The set of authorizations for a given name defines the allowed
use of the corresponding file.

When a file (possibly catalog) "FILE" (say) is OPENed
(CONNECTED), the set of authorizations associated with the file,
is computed as the i_n_t_e_r_s_e_c_t_i_o_n_ of the two sets associated with
(1) the name "FILE" and (2) the catalog in which "FILE" was
found.

If AUTHORIZE is contained in the set of authorizations for a
name, the authorizations may be redefined by:

SET-AUTHORITY(CAT-ID, NAME, AUTHORIZATIONS)

When a new name is linked (by LINKNAME) to an existing name, it
will inherit a s_u_b_s_e_t_ of the authorizations of the existing name.
This subset of "inheritable" authorizations may for a given name
be defined by:

SET-INHERITANCE(CAT-ID, NAME, INHERITABLE-AUTHORIZATIONS)

The t_h_i_r_d_ _l_e_v_e_l_ of authorization enforcement is totally local to
each name. As a means to protect a name against accidental mis-
use, a set of p_r_o_t_e_c_t_i_o_n_s_ is associated with each name. The set
of protections comprises exactly the same elements as the total
set of authorizations (viz. RETRIEVE, UPDATE, ADJUST, DELETE,
AUTHORIZE). Whenever a function is performed via a given name, it
is not only tested against the authorization of the name, but
against the intersection of both authorizations and protections.
\f

The difference between authorizations and protections is twofold:

- protections have no influence whatsoever on inheritance of
authorizations.
- protections may be redefined unconditionally at any time.

Thus, protections are a t_e_m_p_o_r_a_r_y_ guard against a_c_c_i_d_e_n_t_s_.

The protections of a given name is redefined by:

SET-PROTECTION(CAT-ID, NAME, PROTECTIONS)

F_ 1_3_._ _ _ _ _ _ _ _C_O_N_C_L_U_S_I_O_N_ 13.

The report has shown a design of a file system nucleus offering
basic (extendable) but potent facilities.

The emergence of high-level system programming languages makes it
an attractive and practicable task to design a file system suited
for use on even very different computers. The advantages are two-
fold: portability of files between different computer systems,
and rationalization of implementation.

It is the intention that the facilities of the file system nu-
cleus should be extended by augmenting systems. Differing compu-
ter systems may thus extend in different directions, but still
preserve compatibility.

The objectives of the presented file system are chosen according-
ly.

Special emphasis has been placed on the subjects of naming and
support for authorization enforcement (i.e. the Logical File Sys-
tem), providing the file system nucleus with such basic facili-
ties, which augmenting systems may extend to a potent interface
in a multi-user environment.

It is worth noting that the (Logical) file system interface may
be restricted (all within the general philosophy of the design)
to simulate the interface of the UNIX file system (Ritchie et al.
1974).

It is the opinion of some that UNIX -- or something closely
resembling UNIX -- will be chosen as a standard for the environ-
mental file system interface of the currently developed system
programming language ADA. Since this language may come to be the
general standard of the field, it seems reassuring that the pre-
sent file system may be modelled to adhere to such standards.

F_ A_._ _ _ _ _ _ _ _ _R_E_F_E_R_E_N_C_E_S_ A.

Aris, T.A. and B. Tveden-Jørgensen: RC8000 Monitor, part 2.
Reference Manual, RCSL No 31-D 477, A/S Regnecentralen,
Copenhagen 1978

Brinch Hansen, P.: RC4000 Software Multiprogramming System, A/S
Regnecentralen, RCSL No 55-D 140, Copenhagen, February 1971.

Brinch Hansen, P.: The Architecture of Concurrent Programs,
Englewood Cliffs NJ, Prentice-Hall, 1977

Hnatek, E. R.: Semiconductor Memory Update - Part 1, Computer
Design p. 67, Dec. 1979

IBM Diskette, The. General Information Manual. GA 21-9182-2, not
dated.

IBM System/38. Technical Developments, IBM 1978

Jensen, K. and Niklaus Wirth: PASCAL, User Manual and Report,
Springer-Verlag, New York 1975

Knuth, D. E.: The Art of Computer Programming, Addison-Wesley,
1969

Kristensen, B. B. et al.: BETA Language Proposal. DAIMI PB-98,
June 1979

Kristensen, Karl Aage: Filsystemer. Speciale nr. 59 ved DIKU,
1979

Lampson, B.W.: Redundancy and Robustness in Memory Protection IFIP
74, p. 128-132, 1974

Madnick, S. E. and Alsop, J. W.: A Modular Approach to File
System Design, SJCC 1969b
\f

Madnick, S. E.: Design Strategies for File Systems: A Working
Model. Selected Papers from FILE 68, p. 117
Amsterdam 1969a

Olsen, J.C.: RC3600 Catalog System, System Programmer>s Guide, RCSL
No 43-GL 7915, A/S Regnecentralen, Copenhagen 1978

Ravn, A. P.: I/O in Operating Systems - Principles and
Implementation (draft). DIKU 1980

Ravn, A. P.: Noter til kursus: Ind- og Udlæsning på
Operativsystemniveau, DIKU 1979

Redell, D.D. et al.: Pilot: An Operating System for a Personal
Computer. CACM, vol. 23, No 2, February 1980

Ritchie, D.M. et al.: The UNIX Time-Sharing System, vol. 17,
No 7, July 1974

Staunstrup, Jørgen: Pascal80 Report. A/S Regnecentralen af 1979,
RCSL No 52-AA 964, 1980

Staunstrup, J. og S. M. Sørensen: PLATON, A High Level Language
for System Programming, RECAU, R-75-59, May 1975

Stoy, J.E. and C. Strachey: OS6 - An Experimental Operating System
for a Small Computer, part 1: CACM, vol. 15, No 2, part 2:
CACM, vol. 15, No 3, June 1972

Teorey, T.J. and T.B. Pinkerton: A Comparative Analysis of Disk
Scheduling Policies. CACM, Vol. 15, No 3, pp. 177-184, March
1972

Wegner, P.: Programming with ADA: An Introduction by Means of
Graduated Examples. SIGPLAN Notices vol. 14, No 12, December
1979

Wirth, N.: MODULA: a Language for Modular Multiprogramming,
Software Practice and Experience 7, 1, pp. 3-55, 1977

Wong, C.K.: Minimizing Expected Head Movement in One-Dimensional
and Two-Dimensional Mass Storage Systems. Computing Surveys,
vol. 12, No 2, June 1980
\f

«eof»

DataMuseum.dk

RegneCentralen RC850

⟦fac708cd9⟧ TextFile

Derivation

TextFile