DataMuseum.dkPresents historical artifacts from the history of: DKUUG/EUUG Conference tapes |
Length: 24265 (0x5ec9) Types: TextFile Names: »internals.texinfo«
└─⟦a05ed705a⟧ Bits:30007078 DKUUG GNU 2/12/89 └─⟦c06c473ab⟧ »./UNRELEASED/lispref.tar.Z« └─⟦1b57a2ffe⟧ └─⟦this⟧ »internals.texinfo«
@setfilename ../info/internals
@node GNU Emacs Internals, Standard Errors, Tips and Standards, Top
@comment  node-name,  next,  previous,  up
@appendix GNU Emacs Internals

This chapter describes many internal aspects of GNU Emacs that may be
of interest only to programmers.

@menu
* Building Emacs::
* Pure Storage::
* Garbage Collection::
* Writing Emacs Primitives::
* Object Internals::
@end menu

@node Building Emacs, Pure Storage, GNU Emacs Internals, GNU Emacs Internals
@section Building Emacs
@cindex building Emacs
@pindex temacs

To build Emacs, you first compile the C sources.  This produces a
program called @code{temacs}, also called a @dfn{bare impure Emacs}.
This version of Emacs contains the Emacs Lisp interpreter and I/O
routines, but not the editing commands.

@pindex loadup.el
To create a working Emacs editor, you give the command
@samp{temacs -l loadup}.  This causes @code{temacs} to evaluate the
Lisp files named in the @file{loadup.el} file.  These files create the
normal Emacs editing environment.

@pindex xemacs
It takes a long time for @code{temacs} to evaluate the Lisp files.
However, you can create a version of Emacs that starts more quickly by
@dfn{dumping} a complete Emacs to an executable called @code{xemacs}.
The dumped version starts quickly because the Lisp code does not have
to be evaluated again.  This is what you usually do when you build
Emacs.  Renamed @code{emacs}, the @code{xemacs} executable is what most
people use for Emacs.  To produce it, give the command
@samp{temacs -l loadup dump}.

The @code{xemacs} executable will automatically load a user's
@file{.emacs} file, or the default site initialization file.  This
means that you can change the environment produced by a dumped Emacs
without rebuilding Emacs; or you can produce a version of Emacs that
suits you and is not the same as all the other instances of Emacs.

On some systems, dumping does not work.  Then, you can start Emacs with
the @samp{temacs -l loadup} command.
This takes a long time, but since you need to start Emacs once a day at
most (and once a week or less frequently if you never log out), the
extra time is not much of a problem.

@defun dump-emacs to-file from-file
@cindex unexec
This function dumps the current state of Emacs into an executable file
@var{to-file}.  It takes symbols from @var{from-file} (this is normally
the executable file @file{temacs}).

See @code{Snarf-documentation} in @ref{Documentation Strings}.
@end defun

@defun emacs-version
This function returns a string describing the version of Emacs that is
running.  It is useful to include this string in bug reports.

@example
(emacs-version)
     @result{} "GNU Emacs 18.36.1 of Fri Feb 27 1987 on slug (berkeley-unix)"
@end example
@end defun

@defvar emacs-build-time
The value of this global variable is the time at which Emacs was built
at the local site.

@example
emacs-build-time
     @result{} "Fri Feb 27 14:55:57 1987"
@end example
@end defvar

@defvar emacs-version
The value of this variable is the version of Emacs being run.  It is a
string, e.g. @code{"18.36.1"}.
@end defvar

@node Pure Storage, Garbage Collection, Building Emacs, GNU Emacs Internals
@section Pure Storage

There are two types of storage in GNU Emacs Lisp for user-created Lisp
objects: @dfn{normal storage} and @dfn{pure storage}.  Normal storage is
where all the new data created during an Emacs session is kept.  When a
program conses up a list, or the user defines a new function or loads a
library, the result is placed in normal storage.

@cindex memory allocation
If normal storage runs low, Emacs asks the operating system to allocate
more memory in blocks of 1k bytes.  Each block is allocated for one type
only, meaning that symbols, conses, vectors, etc. are segregated in
distinct blocks in memory.
After a certain amount of storage has been allocated (determined by the
variable @code{gc-cons-threshold}), the garbage collector is called to
collect all storage which has been used and abandoned.

Pure storage differs from normal storage in that it is not expandable,
it is not collected by the garbage collector, and it is sharable,
meaning that if two people are running Emacs, the operating system may
choose to let them share the same memory for all the sharable portions
of Emacs.

In essence, pure storage is used when initially building Emacs from the
sources, while loading the standard files that everyone will want.
Normal storage is used during sessions with Emacs.

There is a finite amount of space allocated to pure storage, and this
amount is based upon the number of functions that are normally loaded
when building Emacs.  If you load more functions into pure storage, you
should increase this amount.  The amount of pure storage allocated to
Emacs is set in the @file{emacs/src/config.h} file.

@defun purecopy object
This function makes a copy of @var{object} in pure storage and returns
it.  It copies strings by simply making a new string with the same
characters in pure storage.  It recursively copies the contents of
vectors and cons cells.  It does not make copies of symbols, or any
other objects, but just returns them unchanged.  It signals an error if
asked to copy markers.

This function is used only while Emacs is being built and dumped, and
appears only in the @file{emacs/lisp/loaddefs.el} file, among the Lisp
sources.
@end defun

@defvar pure-bytes-used
The value of this variable is the number of bytes of pure storage
allocated so far.  Typically, in a dumped Emacs, this number is very
close to the total amount of pure storage that exists.
@end defvar

@defvar purify-flag
This variable determines whether @code{defun} should make a copy of the
function definition in pure storage.
If it is non-@code{nil}, then the function definition is copied into
pure storage.

This flag is @code{t} while loading all of the basic functions for
building Emacs initially (allowing those functions to be sharable and
non-collectible).  It is set to @code{nil} when Emacs is saved out as
@code{xemacs}.  The flag is set and reset in the C sources.

You should not change this flag in a running Emacs.
@end defvar

@node Garbage Collection, Writing Emacs Primitives, Pure Storage, GNU Emacs Internals
@section Garbage Collection

All functions which build new structures (be they lists, strings,
buffers, etc.) require storage space for those structures.  It is quite
common to use some storage for a while, and then release it, for
example, by killing a buffer or by deleting the last pointer to an
object.  Emacs provides a @dfn{garbage collector} to reclaim this
abandoned storage.  (`Garbage recycler' might be a more intuitive
metaphor for this function.)

The garbage collector operates by scanning all the objects which are
accessible to the user and marking those that are in use.  This scan
includes all the symbols, their values and associated function
definitions, and any data presently on the stack.  Any objects which are
accessible through objects that are in use are also marked as being in
use.  Everything else is not in use and is therefore garbage.

Unused cons cells are simply strung together onto a @dfn{free list} for
future allocation.  Strings are first ``compacted'', and then all unused
string space is made available to the string creation functions.

@quotation
@cindex Common Lisp garbage collection
@b{Common Lisp Note:} Unlike other Lisps, GNU Emacs Lisp does not call
the garbage collector when storage is exhausted.  Instead, it simply
asks the operating system to allocate more storage, and processing
continues until @code{gc-cons-threshold} bytes have been used.
This means that you can run the garbage collector and then be sure that
you won't need another garbage collection until another
@code{gc-cons-threshold} bytes have been used.
@end quotation

@deffn Command garbage-collect
This command reclaims storage used by Lisp objects that are no longer
needed.  It returns information on the amount of space in use.

Garbage collection happens automatically if you use more than
@code{gc-cons-threshold} bytes of Lisp data since the previous garbage
collection.  You can also request it explicitly by calling this
function.

@code{garbage-collect} returns a list containing the following
information:

@example
((@var{used-conses} . @var{free-conses})
 (@var{used-syms} . @var{free-syms})
 (@var{used-markers} . @var{free-markers})
 @var{used-string-chars}
 @var{used-vector-slots})

(garbage-collect)
     @result{} ((3435 . 2332) (1688 . 0) (57 . 417) 24510 3839)
@end example

Here is a table explaining each element:

@table @var
@item used-conses
The number of cons cells in use.

@item free-conses
The number of cons cells for which space has been obtained from the
operating system, but that are not currently being used.

@item used-syms
The number of symbols in use.

@item free-syms
The number of symbols for which space has been obtained from the
operating system, but that are not currently being used.

@item used-markers
The number of markers in use.

@item free-markers
The number of markers for which space has been obtained from the
operating system, but that are not currently being used.

@item used-string-chars
The total size of all strings, in characters.

@item used-vector-slots
The total number of elements of existing vectors.
@end table
@end deffn

@defopt gc-cons-threshold
The value of this variable is the number of bytes of storage that may be
used after one garbage collection before another one is automatically
called.
Storage is ``used'' by @dfn{consing} (a cons cell is eight bytes),
creating strings (one byte per character plus a few bytes of overhead),
adding text to a buffer, etc.

The initial value is 100,000.  You can set this variable to a value
above 100,000 to reduce the frequency of garbage collections.  You can
make collections more frequent by setting a smaller value, down to
10,000.  A value less than 10,000 remains in effect only until the
subsequent garbage collection, at which time @code{garbage-collect}
sets it back to 10,000.
@end defopt

@node Writing Emacs Primitives, Object Internals, Garbage Collection, GNU Emacs Internals
@section Writing Emacs Primitives
@cindex primitive function internals

Certain functions, and all special forms, are written in C.  A
convenient interface is provided via a set of macros.  The only way to
really understand how to write new C code is to read the source;
however, some information is provided here.

An example of a special form (an ordinary function would have the same
general appearance) is the definition of @code{or}, from @file{eval.c}.

@cindex garbage collection protection
@example
/* NOTE!!! Every function that can call EVAL must protect its args
   and temporaries from garbage collection while it needs them.
   The definition of `For' shows what you have to do.
   */

DEFUN ("or", For, Sor, 0, UNEVALLED, 0,
  "Eval args until one of them yields non-NIL, then return that value.\n\
The remaining args are not evalled at all.\n\
If all args return NIL, return NIL.")
  (args)
     Lisp_Object args;
@{
  register Lisp_Object val;
  Lisp_Object args_left;
  struct gcpro gcpro1;

  if (NULL(args))
    return Qnil;

  args_left = args;
  GCPRO1 (args_left);

  do
    @{
      val = Feval (Fcar (args_left));
      if (!NULL (val))
        break;
      args_left = Fcdr (args_left);
    @}
  while (!NULL(args_left));

  UNGCPRO;
  return val;
@}
@end example

Here is a precise explanation of the arguments to the @code{DEFUN}
macro:

@enumerate
@item
The first argument is the name of the function in Lisp; in this
example, the function is named @samp{or}.

@item
The second argument is the C function name for this function.  This is
the name that is used in C code for calling the function.  The name is,
by convention, @samp{F} prepended to the Lisp name, with all dashes
(@samp{-}) in the Lisp name changed to underscores.  Thus, if your C
code wishes to call this function, it will call @samp{For}.  Remember
that the arguments must be of type @code{Lisp_Object}; various macros
and functions for creating @code{Lisp_Object} values are provided in
the file @file{lisp.h}.

@item
The third argument is the name of the C variable representing the Lisp
primitive that this function implements.  This name is by convention
@samp{S} prepended to the name, in the same manner that the function
name is created.

@item
The fourth argument is the minimum number of arguments that must be
provided; i.e., the number of required arguments.  In this case, no
arguments are required.

@item
The fifth argument is the maximum number of arguments that can be
provided.  Alternatively, it can be @code{UNEVALLED}, indicating a
special form that receives unevaluated arguments.  A function with the
equivalent of an @code{&rest} argument would have @code{MANY} in this
position.  Both @code{UNEVALLED} and @code{MANY} are macros.
This argument must be one of these macros or a number at least as large
as the fourth argument.

@item
The sixth argument is an interactive specification exactly like the one
provided in Lisp.  In this case it is 0 (a null pointer), indicating
that this function cannot be called interactively.  A value of
@code{""} indicates an interactive function not taking arguments.

@item
The last argument is the documentation string.  It is written just like
a documentation string for a function defined in Lisp, except you must
write @samp{\n\} at the end of each line.  In particular, the first line
should be a single sentence.
@end enumerate

In addition to the @code{DEFUN} arguments, you must provide the list of
C argument names and declare their types (always @code{Lisp_Object}).

If you are modifying a file that already has Lisp primitives defined in
it, find the function near the end of the file named
@code{syms_of_@var{something}}, and add a line of the form

@example
defsubr (&Sname);
@end example

@noindent
If the file doesn't have this function, or you have created a new file,
add a @code{syms_of_@var{filename}} function (e.g.,
@code{syms_of_eval}), and find the spot in @file{emacs.c} where all of
these functions are called.  Add a call to your symbol initialization
function there.  This makes all the subroutines (primitives) available
from Lisp.

Here is another function, with more complicated arguments.  This comes
from the code for the X window system, and it demonstrates the use of
macros and functions to manipulate Lisp objects.
@example
DEFUN ("coordinates-in-window-p", Fcoordinates_in_window_p,
  Scoordinates_in_window_p, 2, 2,
  "xSpecify coordinate pair: \nXExpression which evals to window: ",
  "Return non-nil if POSITIONS (a list, (SCREEN-X SCREEN-Y)) is in WINDOW.\n\
Returned value is list of positions expressed\n\
relative to window upper left corner.")
  (coordinate, window)
     register Lisp_Object coordinate, window;
@{
  register Lisp_Object xcoord, ycoord;

  if (!LISTP (coordinate))
    wrong_type_argument (Qlistp, coordinate);
  CHECK_WINDOW (window, 2);
  xcoord = Fcar (coordinate);
  ycoord = Fcar (Fcdr (coordinate));
  CHECK_NUMBER (xcoord, 0);
  CHECK_NUMBER (ycoord, 1);
  if ((XINT (xcoord) < XINT (XWINDOW (window)->left))
      || (XINT (xcoord) >= (XINT (XWINDOW (window)->left)
                            + XINT (XWINDOW (window)->width))))
    @{
      return Qnil;
    @}
  XFASTINT (xcoord) -= XFASTINT (XWINDOW (window)->left);
  if (XINT (ycoord) == (screen_height - 1))
    return Qnil;
  if ((XINT (ycoord) < XINT (XWINDOW (window)->top))
      || (XINT (ycoord) >= (XINT (XWINDOW (window)->top)
                            + XINT (XWINDOW (window)->height)) - 1))
    @{
      return Qnil;
    @}
  XFASTINT (ycoord) -= XFASTINT (XWINDOW (window)->top);
  return (Fcons (xcoord, Fcons (ycoord, Qnil)));
@}
@end example

Similar macros exist as C equivalents of @code{defconst} and
@code{defvar}, as well as a few others that have no equivalent in the
Lisp interpreter.

Note that you cannot call functions defined in Lisp directly, the way
the primitive function @code{Fcons} is called above.  You must create
the appropriate Lisp form, protect everything from garbage collection,
and @code{Feval} the form, as was done in @code{For} above.

@file{eval.c} is a very good file to look through for examples;
@file{lisp.h} contains the definitions for some important macros and
functions.

@node Object Internals, , Writing Emacs Primitives, GNU Emacs Internals
@section Object Internals
@cindex object internals

GNU Emacs Lisp manipulates many different types of data.
The actual data is stored in a heap and the only access that programs
have to the data is through pointers.  Pointers are thirty-two bits wide
in most implementations.  Depending on the operating system and type of
machine for which you compile Emacs, twenty-four to twenty-six bits are
used to indicate the object, and the remaining six to eight bits are
used for a tag that identifies the object's type.

Because all access to data is through tagged pointers, it is always
possible to determine the type of any object.  This allows variables to
be untyped, and the values assigned to them to be changed without regard
to type.  Function arguments also can be of any type; if you want a
function to accept only a certain type of argument, you must check the
type explicitly using a suitable predicate (@pxref{Type Predicates}).
@cindex type checking internals

@menu
* Buffer Internals::
* Window Internals::
* Process Internals::
@end menu

@c !!! perhaps reformat the list of type tags with more explanation
@cindex object type tags
Emacs has more than twenty types of tags; here is the list, from
@file{emacs/src/lisp.h}: integer, symbol, marker, string, vector of Lisp
objects, cons, byte-compiled function, editor buffer, built-in function,
internal value returned by subroutines of read, forwarding pointer to an
int variable, boolean forwarding pointer to an int variable, object
describing a connection to a subprocess, forwarding pointer to a
Lisp_Object variable, pointer to a vector-like object describing a
display screen, Lisp_Internal_Stream, Lisp_Buffer_Local_Value,
Lisp_Some_Buffer_Local_Value, Lisp_Buffer_Objfwd, Lisp_Void, window used
for Emacs display, Lisp_Window_Configuration.

@node Buffer Internals, Window Internals, Object Internals, Object Internals
@subsection Buffer Internals
@cindex buffer internals

Buffers have a set of fields which are not directly accessible by the
Lisp programmer.
However, there are often functions that can access and change the
values of these fields, even though the field names themselves are not
usable by the Lisp programmer in any way.  The fields are (as of Emacs
18):

@table @code
@item name
The buffer name is a string which names the buffer.  It is guaranteed to
be unique.  @xref{Buffer Names}.

@item save_modified
This field contains the time when the buffer was last saved.
@xref{Buffer Modification}.

@item modtime
This field contains the modification time of the visited file.  It is
set when the file is written or read.  Every time the buffer is written
to the file, this field is compared to the modification time of the
file.  @xref{Buffer Modification}.

@item auto_save_modified
This field contains the time when the buffer was last auto-saved.

@item last_window_start
This field contains the position in the buffer at which the display
started the last time the buffer was displayed in a window.

@item undodata
This field contains records which tell Emacs how it can undo the last
set of changes to the buffer.  @xref{Undo}.

@item syntax_table_v
This field contains the syntax table for the buffer.  @xref{Syntax Tables}.

@item markers
This field contains the chain of all markers that point into the buffer.
At each deletion or motion of the buffer gap, all of these markers must
be checked and perhaps updated.  @xref{Markers}.

@item backed_up
This field is a flag which tells whether the visited file has been
backed up.

@item mark
This field contains the mark for the buffer.  The mark is a marker,
hence it is also included on the list @code{markers}.

@item local_var_alist
This field contains the association list containing all of the
buffer-local variables and their associated values.  A copy of this
list is returned by the function @code{buffer-local-variables}.

@item mode_line_format
@xref{Mode Line Format}.
@end table

@node Window Internals, Process Internals, Buffer Internals, Object Internals
@subsection Window Internals
@cindex window internals

Windows have the following accessible fields:

@table @code
@item height
The height of the window, measured in lines.  All windows (save the
minibuffer window) have a minimum height of two lines, one of which is
the mode line.

@item width
The width of the window, measured in columns.  There is no limit on the
width of a window, although a width of less than 2 does not allow any
characters at all to be displayed (as one column is devoted to the line
that divides side-by-side windows).

@item buffer
The buffer which the window is displaying.  This may change often during
the life of the window.

@item start
The position in the buffer which is the first character to be displayed
in the window.  This character is always located in the upper left
corner of the window (location @code{(0,0)}).  There is no restriction
on which character in a buffer this is, although it is common for it to
be the first character on a line.

@item pointm
@cindex window point internals
This is the point of the current buffer when this window is selected;
when it is not selected, it retains its previous value.  When the window
is reselected, this once again becomes the point for the current buffer
(assuming that the buffer associated with the window has not changed).
This allows Emacs to maintain different points in different windows,
even when they display the same buffer.  (This field is only useful
when there are multiple windows displaying the same buffer.)

@item left
This is the left-hand edge of the window, measured in columns.  (The
left-most column on the screen is column 0.)

@item top
This is the top edge of the window, measured in lines.  (The top line on
the screen is line 0.)

@item next
This is the window that is the next in the chain of siblings.

@item prev
This is the window that is the previous in the chain of siblings.

@item force_start
This says that the next redisplay may not scroll the text heuristically.

@item hscroll
This is the number of columns that the display in the window is scrolled
horizontally to the left.  Normally this is 0.

@item use_time
This is the last time that the window was selected.  This field is used
by @code{get-lru-window}.
@end table

@node Process Internals, , Window Internals, Object Internals
@subsection Process Internals
@cindex process internals

The fields of a process are:

@table @code
@item name
A string: the name used when creating the process, or a variant of it.
The name is normally created from the name of the program which it is
running.

@item command
A list: the command arguments that this process was created with.

@item filter
A function (or a symbol which names it): used to accept output from the
process instead of a buffer.

@item sentinel
A function: to be called whenever the process receives a signal.

@item buffer
A buffer: where standard output is directed if there is no filter.

@item pid
An integer: the Unix process ID.

@item command_channel_p
A flag: non-@code{nil} if this is really a command channel instead of a
process.  (This is not really used.)

@item childp
A flag: non-@code{nil} if this is really a child process.

@item flags
A symbol: representing the state of the process, @code{run},
@code{stop}, @code{closed}, etc.

@item reason
A number: the Unix signal that the process received that caused the
process to stop (the process is not necessarily dead).  If the process
has died, and was not killed by a signal, then this is the code the
process exited with.

@item mark
A marker: set to the end of the last output from this process inserted
into the buffer.  Normally this will be the end of the buffer.

@item kill_without_query
A flag: non-@code{nil} means kill this process silently if Emacs is
exited.
@end table
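
As a closing illustration of the tagged pointers described in the
section on object internals, here is a minimal sketch in C.  It is not
taken from the Emacs sources: it assumes a thirty-two bit word with an
eight-bit tag in the high bits and a twenty-four bit value in the low
bits, and the names `sketch_object', `sketch_make', `sketch_type_of',
`sketch_value_of' and the tag numbers are all hypothetical.  The real
definitions are in the file lisp.h and differ between machines.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (hypothetical, not copied from lisp.h):
   the high 8 bits hold the type tag, the low 24 bits hold the value.  */
#define SKETCH_VALBITS 24
#define SKETCH_VALMASK ((UINT32_C(1) << SKETCH_VALBITS) - 1)

typedef uint32_t sketch_object;

/* Illustrative tag numbers only; the real tag values are not shown here.  */
enum sketch_type { SKETCH_INT = 0, SKETCH_SYMBOL = 1, SKETCH_CONS = 2 };

/* Pack a type tag and a 24-bit value into one word.  */
static inline sketch_object
sketch_make(enum sketch_type type, uint32_t value)
{
  assert(value <= SKETCH_VALMASK);   /* the value must fit in 24 bits */
  return ((uint32_t) type << SKETCH_VALBITS) | value;
}

/* Recover the type tag; this is why any object's type can be tested.  */
static inline enum sketch_type
sketch_type_of(sketch_object obj)
{
  return (enum sketch_type) (obj >> SKETCH_VALBITS);
}

/* Recover the value bits (an integer, or the address bits of an object).  */
static inline uint32_t
sketch_value_of(sketch_object obj)
{
  return obj & SKETCH_VALMASK;
}
```

Under these assumptions, sketch_make (SKETCH_CONS, 0x1234) yields the
word 0x02001234, from which sketch_type_of recovers SKETCH_CONS and
sketch_value_of recovers 0x1234.  A type predicate then amounts to
comparing the extracted tag against the expected one.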