\chapter {Overview}

\section {Introduction}

This part of the manual describes aspects of the design of QUIPU which do not need to be known by the administrator or user of QUIPU. However, it documents important design decisions and protocols, which are of interest in understanding how QUIPU works, and in some specific circumstances (e.g., solving interoperability problems). A summary of the main features of QUIPU is also given.

QUIPU fully implements both of the OSI Directory Protocols, and a number of extensions. The highlights of the QUIPU Directory Service Implementation are:
\begin{itemize}
\item Use of memory structures to provide fast access, without use of complex keying techniques.
\item Activity scheduling within the DSA to allow for multiple accesses.
\item General and flexible searching capabilities.
\item A mechanism to provide non-local access control.
\item A mechanism to provide external schema management.
\item A sophisticated approach to the management of distributed operations and replication.
\end{itemize}
The current implementation provides a DSA, and a procedural interface to the Directory Abstract Service and the associated Directory Access Protocol (DAP), which will enable other applications to use the Directory.

\section {General Aims}

To understand the rationale behind some of the decisions, it is useful to consider the original aims of the QUIPU project. These can then be mapped onto a number of more technical considerations:
\begin{itemize}
\item To produce an implementation which followed the emerging standards. This is an aim in itself.
\item Flexibility, to enable the system to be used for experimentation and research into problems relating to directory services.
\item To provide a vehicle for experimentation in the area of distribution and replication.
\item To provide some level of real usage. This sort of work is useless if entirely confined to the laboratory. It is important that it is capable of use for some level of experimental service. However, it is not consciously designed to evolve into a full fledged product.
\end{itemize}
As the work has evolved, the following goals have emerged in addition to the original ones listed above:
\begin{itemize}
\item To provide a public domain OSI Directory implementation as a part of the ISODE package.
\item To provide integrated support for the ISODE Applications.
\item To be used as a part of the initial pilot Directory Service in the UK Academic Community and in other pilots.
\end{itemize}

\section {Technical Goals}

The major goals of the QUIPU Directory Service are:
\begin{itemize}
\item Full support of the Directory Access Protocol and Directory System Protocols \cite{CCITT.Directory}.
\item Support of the majority of the service elements specified in the OSI Directory.
\item Full interworking with other OSI Directory implementations.
\item Very full searching and matching capabilities, beyond the minimum required by the OSI Directory.
\item Provision of a system which has potential for very high distribution.
\item Support of distributed operations in a manner which is fully conformant with respect to non-QUIPU systems, and provides additional functionality for QUIPU systems.
\end{itemize}
The following areas were not intended as goals in the initial system. Some discussion is given as to how these areas might be tackled in future versions.
\begin{itemize}
\item The QUIPU Directory is not intended for very large scale systems (i.e., millions and tens of millions of entries per DSA or hundreds of megabytes of data per DSA).
\item Substantial data robustness is not required: there is no need to employ complex data backup techniques, such as replicated hardware.
\item The security aspects of the OSI Directory were initially omitted, as not required by the general aims. At this point, there is no reason why this aspect should not be integrated.
\end{itemize}

\section {Further QUIPU documents}

The following documents are available, in addition to this manual:
\begin{itemize}
\item A paper on the original design, which is mainly of historical interest \cite{inca-paper}.
\item A paper presented at the 1988 IFIP 6.5 conference, which gives a general overview \cite{QUIPU.IFIP}.
\item A paper presented at Esprit Conference Week 1988, which describes the distributed operations \cite{QUIPU.Distributed}.
\item A paper presented to the Dutch \unix/ User Group in November 1989, which describes how Quipu DSAs navigate the DIT \cite{QUIPU.Navigate}.
\end{itemize}
These papers, except the first, are distributed online with QUIPU.

\chapter {General Design}

\section {Overview}

This chapter describes general design decisions. In particular, issues relating to use of the OSI Directory are covered, rather than system implementation decisions. However, the two are somewhat bound up. Attention is drawn to the protocol extensions defined in Section~\ref{edb-ros}. Note that these do {\em not} affect interactions with non-QUIPU DSAs (or DUAs).

The following aspects of the OSI Directory are not handled in QUIPU 6.0.
\begin{itemize}
\item The protocol elements for support of directory use of authentication are handled in a conformant manner, but the associated services are not available to the end user.
\item Search is always supported by multicasting. This does {\em not} affect the basic service offered to the user, but means that prohibition of chaining is not possible in all cases.
\item Partial Outcome Qualifier is not supported for List.
\item There are some aspects of distributed operation where interaction with another conforming system would not be fully general. In particular, QUIPU might not be able to be configured with references to point at a complex configuration where not all sibling entries are held.
\end{itemize}
Otherwise, QUIPU 6.0 is believed to conform to the standard.

\section {Service Controls}

QUIPU use of service controls conforms to the OSI Directory. Comments are made on those controls where QUIPU makes a choice with respect to some option given by the OSI Directory.
\begin{description}
\item [preferChaining] Chaining will be done.
\item [chainingProhibited] Chaining will not be done. For some cases of the search operation, this means that the QUIPU Directory Service will not be able to provide the service, and will return a ``chaining required'' error.
\item [localScope] The scope will be restricted to the DSA concerned (i.e., no chaining will be done).
\item [dontUseCopy] If this is set, the master data will be used. This may have a significant impact on performance for operations on entries which are high up the tree, and for the DSAs which master this information. These issues need study.
\item [dontDereferenceAliases] Followed as per the OSI Directory.
\item [priority] This is used to help control scheduling within the DSA. High priority tasks are dealt with before low priority tasks (note: there are no checks here, so this is open to mis-use!).
\item [timeLimit] Followed as per the OSI Directory.
\item [sizeLimit] Followed as per the OSI Directory.
\item [scopeOfReferral] The OSI Directory is followed, although QUIPU does not make use of this control.
\end{description}
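The way these controls combine when a request cannot be answered locally can be summarised in a few lines of C. The sketch below is purely illustrative: the identifiers are hypothetical and are not taken from the QUIPU sources, and the handling of localScope is simplified to ``do whatever can be done within this DSA''.
\begin{verbatim}
/*
 * Illustrative sketch only: one way a DSA might combine the service
 * controls described above when a request cannot be answered locally.
 * The identifiers are hypothetical and are not taken from QUIPU.
 */
#include <stdbool.h>

struct svc_controls {
    bool prefer_chaining;
    bool chaining_prohibited;
    bool local_scope;
};

enum next_action { ANSWER_LOCALLY, CHAIN, REFER, CHAINING_REQUIRED_ERROR };

enum next_action
decide(const struct svc_controls *sc, bool chaining_needed)
{
    if (sc->local_scope)
        /* Scope is restricted to this DSA: no chaining is attempted. */
        return ANSWER_LOCALLY;
    if (sc->chaining_prohibited)
        /* Some search cases can only be completed by multicasting,
         * so a "chaining required" error results.                    */
        return chaining_needed ? CHAINING_REQUIRED_ERROR : REFER;
    return sc->prefer_chaining ? CHAIN : REFER;
}
\end{verbatim}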
\chapter {Distributed Operation}
\label {dist-op}

\section {Overview}

Distributed Operation is a major aspect of the QUIPU Directory Service. Sadly, the OSI Directory specifications in this area are, in the author's opinion, rather unsatisfactory. Therefore, the QUIPU distributed operations are described in a slightly different manner. The concept of ``Naming Context'' is not used, and the significance of ``Knowledge'' is de-emphasised. The external view of this functionality is fully in line with the standard.

Some of the concepts defined in this chapter do {\em not} correspond to the ISO/CCITT terms, and so new terminology is introduced. Standard terminology is used in the standard way.

\section {DSA/DUA Interaction Model}

There are some interesting choices to be made between DSA Referral and Chaining. A DUA will start by contacting a local DSA, specifying that chaining is preferred (i.e., DSA referrals should not be passed back to the DUA). After that, the first DSA will proceed by use of DSA Referral, except for operations where this is not possible in the QUIPU framework (some cases of search). The advantages of this approach are:
\begin{itemize}
\item The DUA code can be kept to a minimum, as there is no need to handle referrals. This does mean that the DUA must always interact with the Directory Service through a DSA which supports chaining (which might exclude some implementations).
\item Always going through a local DSA allows a ``per system'' cache to be maintained in a coherent manner.
\item The overhead of maintaining chained connections is not passed on too far.
\end{itemize}
Note that whilst the DUA procedural interface does not handle referrals transparently, they are defined in the service interface, so that an application can choose to handle the referrals directly if it wishes to do so.

\section {Model of Data Distribution}
\label {model-dist}

This is a critical section of the design. It is essential to understand it before studying distributed operation.

\subsection {Entry Data Blocks}

For the root and every non-leaf vertex, there will be an {\em Entry Data Block} (EDB) which contains {\em all} information pertaining to the next level down in the DIT. Figure~\ref{edbf} gives an ASN.1 description of the Entry Data Block format.
\tagrind [htbp] {edb}{Entry Data Block Format}{edbf}
It should be noted and remembered that the Entry Data Block associated with an entry, and described as ``the Entry Data Block of the entry'', contains the entry's children (i.e., their attributes and RDNs) and not the attributes of the entry itself. This is {\em not} necessarily intuitive.

\subsection {Masters and Slaves}

Every Entry Data Block has {\em Master} and {\em Slave} copies. There will typically be only one master (although there may be multiple master copies, where data is maintained ``out of band''). Slave copies are automatically replicated from a Master EDB. This may be direct or indirect. The full propagation path must be acyclic (loop free). A DSA has either all or none of an Entry Data Block as a Master or Slave (viz: Entry Data Blocks are atomic). Any other DIT information it contains is treated as cached data.
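To make this model concrete, the information a DSA conceptually holds for each Entry Data Block it keeps might be summarised as follows. This is an illustrative sketch with hypothetical names; the actual in-core structures are described in the chapter on implementation choices.
\begin{verbatim}
/*
 * Illustrative sketch (hypothetical names): what a DSA conceptually
 * records for each Entry Data Block it holds.
 */
enum edb_role { EDB_MASTER, EDB_SLAVE, EDB_CACHED };

struct edb_copy {
    char         *parent_dn;  /* the entry this EDB "belongs" to        */
    enum edb_role role;       /* all or nothing: EDBs are atomic        */
    char         *version;    /* compared during slave update           */
    struct entry *children;   /* the child entries: RDNs and attributes */
};
\end{verbatim}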
A DSA does not need to have any Master or Slave data.

DSAs are named, and represented in the DIT. One of the reasons for this is to enable use of the Directory to identify the OSI location of DSAs. This OSI location can then be adjusted transparently to the logical mapping of the DIT onto DSAs. This can be seen as treating a DSA in the same manner as any other Application Entity. This simplifies the implementation, as there does not need to be specific storage of additional configuration information (knowledge). An important piece of information stored in the entry of each DSA is the list of EDBs and how they are replicated. This information is the basis for automatic replication.

\subsection {QUIPU Subordinate References}

An entry has associated with it attributes which indicate the DSAs which have Master or Slave Entry Data Blocks for the entry in question. These pointers are known as {\em Quipu Subordinate References} (QSR). For every QSR, the DSA must have a Master or Slave copy of the EDB, as implied by the QSR. The converse is not true: DSAs may have copies of EDBs without there being QSRs. The DSAs with QSRs have the information {\em and} are prepared to answer public queries about the Entry Data Block in question. DSAs with EDBs (typically slave copies) and no QSRs usually have copies for performance or robustness reasons.

Quipu Subordinate References are similar to the standardised Non-Specific Subordinate References (NSSR). There are the following differences:
\begin{itemize}
\item QSRs admit replication, and therefore there are Master QSRs and Slave QSRs.
\item A QSR always points to all of the relevant information, whereas an NSSR may only point to a part of it and must be used in ``and'' conjunction with other NSSRs.
\end{itemize}

\subsection {Access to the root EDB}

There is no requirement for a given DSA to hold a copy of the root Entry Data Block. However, to be able to systematically process all queries, there must be direct or indirect access to the root Entry Data Block. Therefore, every DSA which does not have a copy of the root Entry Data Block must know the name and address of one or more DSAs which either have a copy of the root Entry Data Block, or are closer (in terms of these references) to a DSA which has a copy. This approach is similar to, but not the same as, use of superior references defined in the standard.

\section {Standard Knowledge References}

In addition to the QUIPU-specific QSRs, an entry might also contain standard Knowledge References, as defined in the OSI Directory. These are used to point to data not contained in a QUIPU DSA. There are three types of reference defined in the standard:
\begin{description}
\item[Subordinate Reference] Pointer to an Entry.
\item[Cross Reference] From the QUIPU standpoint, the same as a subordinate reference.
\item[Non Specific Subordinate Reference] Pointer to multiple subordinates which must be queried for the next level down. QSRs are similar to this (see previous section).
\end{description}
In the first two cases, the entry in the Entry Data Block is considered to be ``Knowledge only'' (although other entry information may be cached). In the third case the entry will also have full information on itself. If any of these are present, there will be no QUIPU master or slave pointers. These three types of pointer are mutually exclusive\footnote{Thought(SEK) --- does the standard let you have a subordinate reference plus a cached NSSR?}. These attributes are not supported in QUIPU 6.0.
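For reference, the different kinds of downward pointer discussed above can be summarised as follows. This is an illustrative enumeration only; the names are hypothetical rather than QUIPU's own.
\begin{verbatim}
/* Illustrative summary of the downward pointers an entry may carry.  */
enum ref_kind {
    REF_QSR_MASTER,   /* QUIPU: DSA holding the master copy of the EDB */
    REF_QSR_SLAVE,    /* QUIPU: DSA holding a slave copy of the EDB    */
    REF_SUBORDINATE,  /* standard: entry is then "knowledge only"      */
    REF_CROSS,        /* standard: treated as a subordinate reference  */
    REF_NSSR          /* standard: one of several DSAs, all of which
                         must be queried for the next level down       */
};
\end{verbatim}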
\section {Navigation}

Given this data model, a straightforward navigation algorithm can now be specified. The requirement is to locate the entry associated with a specified Distinguished Name. When the entry is arrived at, the operations will behave as prescribed.

The basis of the navigation strategy is that the first DSA (i.e., the one accessed by the DAP) does all of the hard work. Other DSAs, accessed by DSA Referral, either answer the question or return an error. This is important, as it is the basic strategy by which the system ensures completion of queries. There are times when the DSA may depart from this model; these are discussed in Section~\ref{DSA:select} and \cite{QUIPU.Navigate}.

First consider the behaviour of a DSA accessed by the Directory System Protocol (DSP):
\begin{enumerate}
\item Look up the Distinguished Name.
\item If the Distinguished Name is found, go to step 6.
\item If there is a local copy of the Entry Data Block of the parent, return a nameError. The ``matched'' parameter should be set to the Distinguished Name of the Entry Data Block (i.e., the level above the offered name). This is an authoritative NO.
\item Strip the lowest component off the Distinguished Name, and go to step 1. If there are no more components, go to step 5. This process checks for authoritative NO.
\item At this point, the name has not been found, and no relevant Entry Data Block has been found. This implies that the DSA does not hold the root Entry Data Block. Therefore the DSA should return a DSA Referral. The DSA Referral should be the list of DSAs (names and addresses) which are known to be closer to the root.
\item We now have an entry which matches some or all of the original Distinguished Name. Consider this entry.
\item If the entry contains an Alias attribute, dereference it, and go to step 1. Note that if a referral is returned, the appropriate parameters should be set to indicate all dereferences.
\item If the entry is the one specified in the query, return the answer to the query, or the appropriate (authoritative) error.
\item If the entry is of object class ``QuipuNonLeafObject'', return a Referral. This is simply a redirect to a DSA which can take the query at least one step further. The names for the DSA Referral should be taken from the master and slave Quipu Subordinate References. Where the calling DSA is a non-QUIPU DSA, the Presentation address of the Master DSA must be looked up, and only this one returned.
\item If the entry contains a reference, the appropriate referral should be returned.
\item The query refers to an entry below the bottom of the DIT. An authoritative nameError can be returned.
\end{enumerate}

Now consider the slightly more complex case of the initial DSA (doing the DSA Referral). Steps 1-4 are followed as above, to determine authoritative NO.
\begin{enumerate}
\setcounter{enumi}{4}
\item At this point, the name has not been found, and no relevant Entry Data Block has been found. This implies that the DSA does not hold the root Entry Data Block. The list of DSAs which are known to be closer to the root are the starting point for the iterative query. Go to step \ref{step-iter}.
\item We now have an entry which matches some or all of the original Distinguished Name. Consider this entry.
\item If the entry contains an Alias attribute, apply the relevant dereference to the original query, and go back to the start.
\item If the entry is the one specified in the query, return the answer to the query, or the appropriate (authoritative) error.
\item If the entry is of object class ``QuipuNonLeafObject'', this gives a list of QSRs to start the iterative query. Go to step \ref{step-iter}.
\item If the entry contains a standard knowledge reference, then go to step \ref{step-iter}. Note that for non-specific subordinate references, {\em all} of the references must be followed before giving up.
\item The query refers to an entry below the bottom of the DIT. An authoritative nameError can be returned.
\item \label{step-iter} Select one of the DSAs from the referral list. The order in which to try the DSAs is arbitrary. However, an attempt is made to select those with the topologically closest name first (e.g., a UK DSA will prefer to query another UK DSA before asking a French one). Try DSAs in turn until one gives an answer or you get bored. Consider the answer. Authoritative answers (yes or no) should be passed back to the DUA. If a Referral is received, recurse to this step, watching carefully for loops.
\end{enumerate}

It can be seen that this navigation relies on data being distributed correctly, and on DSAs other than the one doing the work behaving in a correct manner. Information provided in the referral should be used to ensure that the iteration is progressing, and thus to detect livelock situations.
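The DSP side of this procedure is compact enough to be sketched in C. The sketch below is illustrative only: the types and helper functions (find\_entry, has\_parent\_edb, and so on) are hypothetical stand-ins for the in-core structures described in the chapter on implementation choices, and the step numbers in the comments refer to the first list above.
\begin{verbatim}
/*
 * Condensed, illustrative sketch of the navigation steps for a DSA
 * accessed over the DSP.  All types and helpers are hypothetical
 * stand-ins; this is not the QUIPU source code.
 */
#include <stdbool.h>
#include <stddef.h>

typedef struct dn           DN;      /* a Distinguished Name            */
typedef struct entry        Entry;   /* a vertex of the in-core DIT     */
typedef struct result       Result;  /* answer, error, or referral      */
typedef struct access_point AP;      /* DSA name(s) plus address(es)    */

Entry  *find_entry(DN *name);        /* look the name up in the tree    */
bool    has_parent_edb(DN *name);    /* is the parent's EDB held here?  */
DN     *parent_of(DN *name);
bool    strip_lowest_rdn(DN **name); /* false once nothing is left      */
bool    is_alias(Entry *e);
bool    is_quipu_non_leaf(Entry *e);
bool    has_knowledge_reference(Entry *e);
bool    dn_equal(DN *a, DN *b);
DN     *entry_dn(Entry *e);
DN     *dereference(DN *target, Entry *alias);
Result *answer_or_error(Entry *e);
Result *name_error(DN *matched);     /* an authoritative NO             */
Result *referral(AP *dsas);
AP     *dsas_closer_to_root(void);
AP     *qsr_dsas(Entry *e);          /* master and slave QSRs           */
AP     *reference_dsas(Entry *e);

Result *navigate(DN *target)
{
    DN *name = target;

    for (;;) {
        Entry *e = find_entry(name);                    /* step 1  */

        if (e == NULL) {
            if (has_parent_edb(name))                   /* step 3  */
                return name_error(parent_of(name));
            if (!strip_lowest_rdn(&name))               /* step 4  */
                return referral(dsas_closer_to_root()); /* step 5  */
            continue;
        }

        /* Steps 6 onwards: an entry matching a prefix of the target. */
        if (is_alias(e)) {                              /* step 7  */
            name = target = dereference(target, e);
            continue;
        }
        if (dn_equal(entry_dn(e), target))              /* step 8  */
            return answer_or_error(e);
        /* Step 9: non-QUIPU callers are given the master DSA only.   */
        if (is_quipu_non_leaf(e))
            return referral(qsr_dsas(e));
        if (has_knowledge_reference(e))                 /* step 10 */
            return referral(reference_dsas(e));
        return name_error(entry_dn(e));                 /* step 11 */
    }
}
\end{verbatim}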
\section {List}

The Entry Data Block concept allows the list operation to fall out in a straightforward manner. Navigation to the Entry Data Block belonging to the name provided will give access to the full result for the list operation. Note that where cross/subordinate references are involved, it will be assumed that these are not alias entries (reasonable in practice). This will allow list to be performed in a single DSA in all cases.

\section {Search}

This section describes how the OSI Directory search is supported. This is one of the hardest parts of the implementation, and care must be taken. Note that the DAP argument in DSP is always that provided by the DUA\footnote {Whilst this is in principle one of the key aspects of the way the DSP works, the recommendations for distributed operations violate this principle when dealing with aliases during search.}. This means that the work done by a given DSA must be in relation to the target object. With other operations, this is (fairly) straightforward, as the target object always references the base object of the operation. For searching, care must be taken to correctly verify whether the base object has been reached. This is done by use of the operation progress information, setting the name resolution phase to completed.

The search operation functions by searching the part of the tree implied by the ``subset'' specification. Rather than returning all of the information, the queried DSA will apply the associated filter to the entries in question, and return the filtered result, along with appropriate continuation pointers. This should minimise network traffic.

The case of ``subset = baseObject'' is handled by navigating to the Entry Data Block of the object's parent, and applying the given filter. If the entry is a cross reference or subordinate reference, the reference should be followed (using the same query).

The case of ``subset = oneLevel'' is handled by navigating to the object's own Entry Data Block. There the filter is applied to each of its children. If any of the children are alias entries, the alias should be de-referenced, and a baseObject search applied to the new entry. For each child which is a cross reference or subordinate reference, the reference should be followed, setting the target object to be the child.

The case of ``subset = wholeSubtree'' is handled by navigating to the object's own Entry Data Block. There, the filter is applied to the object and to each of its children. If any of the children are alias entries, the alias should be de-referenced, and a wholeSubtree search applied to the new entry. For each child which has QUIPU children (determined by the presence of a masterDSA attribute), the search should be applied to the master or one of the slave DSAs, with the target object set to the child. For each child which is a cross reference or subordinate reference, the reference should be followed, setting the target object to the child. For each child which has non-specific subordinate references, the search should be applied to {\em all} of the referenced DSAs, with the target object set to the child.

There are three procedures for operating:
\begin{enumerate}
\item Everything handled by the first DSA, with other DSAs returning a mixture of results and partial outcome qualifiers.
\item Proceed by referral until a DSA is reached which has a copy of the base object. Then this DSA proceeds by referral, and returns the full result to the first DSA. This is how QUIPU 6.0 works. It has the advantage of often accumulating search results in a local environment, and so is selectable (possibly in a complex manner).
\item Proceed by referral until a DSA is reached which has a copy of the base object. Then proceed entirely by chaining. This is not done.
\end{enumerate}
There is potential for looping in this procedure. This will be detected and broken by noting loops in the DSA trace information. This takes account of the fact that some distribution will allow a query to re-enter the same DSA a number of times.

The search and list operations make use of the ``partial results'' functionality to return information if a time or size limit is reached. Thus, setting a low size limit will allow a user to easily examine sample information (either by list or search).

\section {Selecting a DSA}
\label{DSA:select}

In Quipu-5.0 the chain/refer choice was very ad hoc. For a DSA, the best choice is usually to give a referral, except where this will not work. To make this calculation, it is necessary to determine if two DSAs can communicate directly. This is done by deriving, from the presentation address of a DSA, a list of connected networks. This can then be used to determine if a pair of DSAs can communicate directly, and is the basis for the chaining/referral choice. This will need the extension of the network address, to allow encoding of private networks other than the three well known ones.

There is an analogous problem when a DSA needs to access a DSA which cannot be accessed directly. Each DSA which does not have full connectivity will have an attribute which indicates network/DSA pairs. This indicates a DSA which may be used (by bilateral agreement) to access a given network by application relay.

The following sections discuss briefly how these choices are made; \cite{QUIPU.Navigate} describes the process more fully.

\subsection {DSA Quality}

Replication gives a choice of DSA to which a given query may be put. The following criteria are relevant:
\begin{itemize}
\item Use an existing association if possible
\item Prefer a QUIPU DSA
\item Prefer a reliable DSA
\item Prefer a ``local'' DSA
\end{itemize}
The first two are straightforward.
A local DSA can be selected using the following\ldots
\begin{enumerate}
\item Prefer to use networks in the order specified by ts\_communities. This will encourage access over a local ether/preferred net.
\item Pick a DSA with a close name (e.g., prefer one in the same country).
\item Pick a DSA in the same DMD. This would need to add a DMD attribute to the DSA entry (encoded as a DN) [not yet implemented].
\end{enumerate}
Picking a reliable DSA is achieved using the following information:
\begin{verbatim}
DSAInfo ::= SEQUENCE {
        dsa                           DistinguishedName,
        lastAttempt                   UTCTime,
        lastSuccess                   UTCTime OPTIONAL,
        failures-since-last-success   INTEGER
}
\end{verbatim}
Thus a DSA will be able to check if it knows about the DSAs in question, and can make a choice based on past results. This should operate dynamically, without operator intervention. Info on DSAs not contacted for a given period should be expunged. It is hoped to store this as an attribute of a DSA in future versions of Quipu.
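As an illustration of how this information might be used, a DSA could rank candidate DSAs with something like the following. The structure and the scoring rule are hypothetical, not part of QUIPU; the text above only requires that past results inform the choice and that stale records are expunged.
\begin{verbatim}
/*
 * Illustrative sketch: ranking candidate DSAs from DSAInfo-style
 * records.  The structure and scoring rule here are hypothetical.
 */
#include <time.h>

struct dsa_info {
    char   *dsa;                 /* DistinguishedName of the DSA       */
    time_t  last_attempt;
    time_t  last_success;        /* 0 if it has never succeeded        */
    int     failures_since_last_success;
};

/* Smaller is better: a recently successful DSA with no failures wins. */
long reliability_cost(const struct dsa_info *di, time_t now)
{
    if (di->last_success == 0)
        return 1000 + di->failures_since_last_success;
    return di->failures_since_last_success * 10
         + (long) ((now - di->last_success) / 3600);  /* hours ago */
}
\end{verbatim}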
\subsection {Unavailable DSAs}

In the case where a DSA is unavailable, and chaining is preferred, a reference will be returned by a QUIPU DSA. A QUIPU DUA which knows it is talking to a QUIPU DSA can rely on this behaviour, and simply use the referral as a diagnostic to the user. It is hoped that the next version of the standard will add an obvious extra parameter.

\subsection {Operating When DSAs are not Fully Interconnected}
\label {tcp-only}

Whilst global interconnection of all application entities is an OSI ideal, it will not be achievable in the short or medium term. Application relaying by DSAs will be needed to achieve full directory connectivity. In general, it is not desirable for DSAs to proceed by chaining: it wastes application level resources unnecessarily. Later, there may be policy reasons to prefer chaining, but these are ignored for now. The internal structure of network addresses allows a DSA to determine if two DSAs can communicate directly.

\section {The External View of QUIPU}

To a non-QUIPU system, QUIPU will appear to work exactly according to the standard. When a QUIPU DSA interacts with another DSA, it will look up its object classes (and probably other information). This will allow it to determine if the other DSA is a QUIPU DSA. When interacting with another QUIPU DSA, the following deviations from the standard are possible. These are primarily concerned with the introduction of replication:
\begin{itemize}
\item Presentation Address might be omitted from Access Point (always present in QUIPU 6.0, for consideration in QUIPU 7.0).
\item Cross References and Subordinate References have multiple values (although QUIPU will probably never send these to itself).
\item Multiple values of Non Specific Subordinate Reference are assumed to have OR conjunction (i.e., they are really QSRs).
\item Use of QUIPU Access control, as described in Section~\ref{dsp-acl}.
\end{itemize}
When a QUIPU DSA returns references which are derived from reference attributes, it will return them as specified. If it returns information derived from QUIPU internal pointers, it will return a non-specific subordinate reference. If the DSA being communicated with is not a QUIPU DSA, it will return only a reference to a selected DSA (as replication is not admitted within the protocol).

\section {Cached Data}
\label {cache-all}

Cached data is not mentioned in the basic algorithm. However, the algorithm can utilise cached data in some circumstances. This is because of the manner in which identification of copy data has been introduced in the final stage of the OSI Directory specification. Cached data may be used whenever this is not prohibited by the service controls.

The standard does not clearly define what ``copy'' data is. In general, QUIPU treats master and slave data as authoritative. Both slave and cached data are returned to the user as ``copy'' data. For this reason, the distinction between slave and copy data can only be internal to QUIPU.

There is no time to live or age information in the OSI Directory Protocols. Care must be taken when caching that spurious information is not passed around indefinitely between DSAs. When QUIPU holds cached data, it will note how long it has had it, and will ``time out'' the data after a tailorable period. This section is open ended. The exact approaches to caching, and the determination of suitable timeout values, will be the subject of experiment.

The important thing about managing cached data is to handle timeouts sensibly. Data cached from a cache may have an indeterminate age. It is important that this data is given a relatively short timeout, to prevent it being circulated indefinitely amongst a set of DSAs. However, {\em if} the data can be verified by usage (e.g., correctly connecting to a DSA verifies a presentation address), it should then be treated as if cached from a master/slave. Non-verified data should be treated in the same manner as user data, which is described in the next section. Data cached from master/slave information should be given a longer timeout. Data is discarded, rather than refreshed automatically. A timeout of some number of days seems appropriate for most data.

\section {Configuration and Slave Update}

A given DSA will have copies of zero or more Entry Data Blocks. A DSA may either be a master for a given Entry Data Block, or a slave. If there are multiple master copies, it is assumed that these are kept coherent by some out of band mechanism. For example, one of them is the ``real'' master, and the others are updated by file transfer when modifications occur. This discussion will proceed for the single master case.

There are three distinct types of DSA:
\begin{enumerate}
\item A DSA with a master copy of the root Entry Data Block.
\item A DSA with a slave copy of the root Entry Data Block.
\item A DSA with no copy of the root Entry Data Block.
\end{enumerate}
As noted in Section~\ref{model-dist}, DSAs of types 2 and 3 will have pointer(s) to a DSA which is ``closer'' to the master copy of the root Entry Data Block. Specifying this hierarchical distribution, as opposed to requiring direct access to the master (as in earlier versions of the OSI Directory), means that there can be many copies of information which needs to be highly replicated, without excessive redundant copying across the Wide Area Network. This will be particularly important for the root Entry Data Block.

DSAs of type 2 will only need the pointer information for initial startup or recovery after catastrophic corruption. When the slave copy of the root Entry Data Block has been obtained for the first time, this will supersede the pointers. DSAs of type 3 will usually use cached information in preference to these pointers, and will only need the pointers if cached information is (or appears to be) invalid.

The only information which a DSA has to obtain locally at initial boot time, other than the DSA pointers, is its own name. All other information may be obtained from the Directory.
Beyond this, the Directory Service manages its own configuration. There is little point in having a Directory Service providing general high speed access to global information, and then requiring an additional system (knowledge) to deal with its own configuration.

Associated with the DSA's entry in the DIT is a specification of the entries for which the DSA is a master, and for which it is a slave. A DSA will be able to derive the location of an Entry Data Block for which it is master from this information. Thus at initial boot, a DSA will utilise its initial DSA pointers to read its own entry. The location of master Entry Data Blocks will be derivable from their name, and so their existence can then be verified by the DSA in question. A DSA which is a master for the root Entry Data Block will have no pointers. However, it can go straight to the master root Entry Data Block, read the information about itself, and proceed as for other DSAs.

It is believed that for early pilots, a high level of copying configuration data will be desirable to achieve robustness. The root and national EDBs will be very highly replicated, even though QUIPU can operate with a rather low level of replication.

\section {DSA Naming}
\label {dsa-naming}

\subsection {Choice of Names to Prevent Loops}

Care must be taken to prevent the situation where the location of a DSA is only known through itself (and other more complex variants). A simple rule for naming DSAs will ensure that this cannot happen. The master DSA for a given entry (i.e., the DSA controlling the Entry Data Block containing the entry's children) should have its name in the Entry Data Block of the entry's parent or at a level higher in the DIT. For example, the master DSA of
\begin{quote}\small\begin{verbatim}
Country=UK, Org=University College London, OU=Computer Science
\end{verbatim}\end{quote}
which contains information on entries below Computer Science, may be labelled
\begin{quote}\small\begin{verbatim}
Country=UK, Org=University College London, DSA=Three Toed Sloth
\end{verbatim}\end{quote}
or
\begin{quote}\small\begin{verbatim}
Country=France, DSA=Capybara
\end{verbatim}\end{quote}
It may not be labelled
\begin{quote}\small\begin{verbatim}
Country=UK, Org=University College London, OU=Computer Science, DSA=Alpaca
\end{verbatim}\end{quote}
or
\begin{quote}\small\begin{verbatim}
Country=France, Org=Inria, DSA=Llama
\end{verbatim}\end{quote}
A little more flexibility could be allowed; however, this rule is simple, it prevents deadlock, and it allows for reasonable labelling practices. The restriction may be relaxed somewhat when the concept of Directory Management Domains is introduced more formally.

\chapter{Access Control and Authentication}
\input{q-secdes}

\chapter {Replicating Updates}
\label {des-first}
\label {dist-update}

\section {Basic Update Approach}
\label {edb-ros}

QUIPU supports a simple automatic update mechanism. This allows for copying of Entry Data Blocks, but with a simple check to ensure that only new information is copied. Slave copies are obtained by use of a new remote operation. The argument to the operation is the name of the Entry, and the version number of the copy of the Entry Data Block held locally. A FULL copy of the Entry Data Block is returned if this version is out of date.

In the DSA's entry, there is a list of Entry Data Blocks for which the DSA has master or slave copies, and the DSA which it gets updates from. For each Entry Data Block, there is the list of DSAs which pull the Entry Data Block, and, for slave copies, which DSA the update should come from. It is assumed that this operation will be invoked sufficiently often for it to be acceptable to consider the slave data as ``official''. For the type of usage being considered, this probably means several times per day.

Within QUIPU, this operation is in a new protocol (QUIPU DSP), which also contains the DSP ASEs. The operation is specified in Figure~\ref{getedb-py}.
\tagrind {getedb}{EDB Access Operation}{getedb-py}
\clearpage
Note that a DSA receiving a GetEDB operation should check the associated EDBInfo, to ensure that the DSA in question is allowed to pull a copy of this EDB.

The operation may be used to determine which version of the EDB is currently master. This might be used when a query with dontUseCopy arrives, in order to determine whether slave information is accurate. This would be a big performance win for search and list operations, due to the potential reduction in information transferred.
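The version check at the heart of this mechanism is simple, and can be sketched as follows. The names and signature below are illustrative only; the actual operation and its arguments are those defined in Figure~\ref{getedb-py}.
\begin{verbatim}
/*
 * Illustrative sketch of the supplier side of the update check: the
 * consumer quotes the version of its local copy, and a FULL EDB is
 * returned only if that copy is out of date.  Hypothetical names.
 */
#include <string.h>

struct edb {
    char *name;       /* entry whose children this EDB holds           */
    char *version;    /* opaque version; the master copy is definitive */
    /* ... the child entries themselves ...                            */
};

const struct edb *get_edb(const struct edb *local,
                          const char *requested_name,
                          const char *consumer_version,
                          int consumer_may_pull)  /* from the EDBInfo  */
{
    if (!consumer_may_pull)
        return NULL;                  /* not allowed to pull this EDB  */
    if (strcmp(requested_name, local->name) != 0)
        return NULL;                  /* no such EDB held here         */
    if (strcmp(consumer_version, local->version) == 0)
        return NULL;                  /* consumer is already current   */
    return local;                     /* send the FULL Entry Data Block */
}
\end{verbatim}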
\chapter {Implementation Choices}

\section {DSA Structure}

Whilst the operation of a QUIPU DSA is fast, its startup procedure is not, due to reading all of its data from a text file on disk. This long startup means that applications must be able to use multiple DSAs, to prevent lockout whilst the local DSA starts up. Also, the process structure of DSAs must be static.

To provide a system with reasonable availability, particularly in view of the system's ability to perform extravagant searching, the basic DSA must be able to handle multiple calls. For this reason, apart from conformance issues, the DSA will be inherently asynchronous, and will need to have its own internal scheduling. Initially, this can be simple minded. However, we are providing a framework for a system which is very sophisticated in this area.

The basic approach of the DSA is to have two (conceptually) co-routined modules, which are interfaced by in-core C structures. The first module is the DSA engine, which resolves inbound queries either by looking them up in its in-core data structures or by generating further queries. The second module is the protocol engine. This is responsible for opening and closing calls, and for mapping between OPDUs on the network and C structures to be handled by the DSA engine. The interface provided to the DSA is largely independent of DAP vs DSP.

\subsection {Memory Structures}

There are a number of structures which are of particular importance. They are summarised here:
\begin{itemize}
\item The Entry is the basic component of the in-core tree, which is linked upwards and downwards between parents and children. The tree always starts at the root, even if there is no information beyond the RDN present. Where an entry corresponds to the base of an Entry Data Block, the parameters of the Entry Data Block are present. Siblings are linked in a chain. Each entry is represented by a `C' structure, which contains:
\begin{itemize}
\item Information on the linkage (hierarchical, and between siblings).
\item How Entry Data Blocks are managed.
\item How the ``special'' attributes (ACL, Schema, Alias, Password, DSA location info) are held.
\end{itemize}
\item There is a structure associated with each connection. This is used to represent actual and desired connections. These structures are linked into a list, and are the key point for the protocol module. They indicate a list of operations and tasks associated with each connection.
When the DSA engine needs a connection, it will see if one is already open to the DSA in question. If it is not, a connection structure will be created, which the protocol engine will act on in due course. Similarly, the protocol engine will close down unneeded connections, possibly after some (intentional) delay.
\item There is a Task structure associated with each query which arrives. This holds the {\em full} state of the task, so that the DSA can switch between tasks at intervals (typically when a network connection blocks). This points to the list of operations which have been generated by the task. This is the key structure for the DSA Engine.
\item There is an Operation structure associated with each pending operation. These structures form a mesh, arranged both by Task and by Connection.
\end{itemize}
Multi-level priorities are associated with tasks and operations, which are used by both engines to control scheduling. This will be done in QUIPU 6.0 on two bases:
\begin{itemize}
\item The user specified priority.
\item The progress of the operation (long searches are downgraded in priority).
\end{itemize}

\subsection {Malloc}

The above is optimised by careful use of malloc. \index{malloc} A purpose built malloc is used; this knows about the memory intensive DSA, and tries to ensure that data accessed at similar times during DSA operation will be stored in the same page of core memory. This has the effect that the number of page faults generated is significantly reduced (early results indicate a twenty percent improvement over the standard malloc supplied with SunOS4).

\subsection {Disk Structures}

All of the data for a given QUIPU DSA will be contained under a single (UNIX) directory. There will be a directory for each RDN where the DSA in question has an Entry Data Block (which may be master, slave, or cached), the name of the directory being the QUIPU text encoding of the RDN. Thus there will be a UNIX directory structure which corresponds to the portion of the DIT held in the DSA. There will be a file in each RDN directory called ``EDB'', which has the authoritative version of the data, and one called ``EDB.bak'', which has the previous version. This might be extended to provide more comprehensive backup.

When the system is booted, the following will occur:
\begin{enumerate}
\item Read tailoring information.
\item Look for the sequence of Entry Data Blocks implied by the DSA's name. These will usually be cached for later reuse. If not, the default addresses must be used.
\item With the DSA's own info available, read in the other master and slave Entry Data Blocks.
\item Read in other Entry Data Blocks, checking for consistency.
\end{enumerate}
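The mapping from a position in the DIT to the corresponding EDB file is therefore a simple path construction, sketched below. The function is illustrative only; the RDN strings are assumed to be already text encoded, and the exact QUIPU encoding is not shown.
\begin{verbatim}
/*
 * Illustrative sketch of the disk layout above: one UNIX directory per
 * (text encoded) RDN under the DSA's data directory, each containing
 * an "EDB" file, with the previous version kept in "EDB.bak".
 */
#include <stdio.h>

/* Build "<dsadir>/<rdn1>/.../<rdnN>/EDB"; returns -1 on overflow.      */
int edb_path(char *buf, size_t len, const char *dsadir,
             const char **rdns, int nrdns)
{
    int i, n = snprintf(buf, len, "%s", dsadir);

    for (i = 0; i < nrdns && n < (int) len; i++)
        n += snprintf(buf + n, len - n, "/%s", rdns[i]);
    if (n < (int) len)
        n += snprintf(buf + n, len - n, "/EDB");
    return (n < (int) len) ? 0 : -1;
}
\end{verbatim}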
\section {OSI Choices}

ROS (1988) and the implied protocols (ACSE and Presentation) will be used. Other combinations (e.g., TP0 over TCP, or TP4 over CLNS) may be used by bilateral agreement between DUA and DSA, or DSA and DSA.

To ensure full connectivity of the QUIPU Directory Service, one of the following conditions must be met:
\begin{enumerate}
\item Support of transport class 0 over international X.25(80). (This condition will be changed to support of CONS with access to the international X.25 subnetwork, when such a statement is realistic.)
\item Access to a DSA which will perform application relaying in line with the procedures of Section~\ref {tcp-only}. This will need a bilateral agreement. It is hoped to establish ``well known'' DSAs which can serve this function for the following well known networks:
\begin{itemize}
\item Janet
\item The DARPA/NSF Internet
\end{itemize}
\end{enumerate}