⟦87bb1c860⟧

TextFile

@setfilename ../info/internals
@node GNU Emacs Internals, Standard Errors, Tips and Standards, Top
@comment  node-name,  next,  previous,  up
@appendix GNU Emacs Internals

This chapter describes many internal aspects of GNU Emacs that may only
be of interest to programmers.

@menu
* Building Emacs::      
* Object Internals::    
* Writing Emacs Primitives::  
* Garbage Collection::  
* Pure Storage::        
@end menu

@node Building Emacs, Pure Storage, GNU Emacs Internals, GNU Emacs Internals
@section Building Emacs

@cindex building Emacs
@pindex temacs

  To build Emacs, you first compile the C sources.  This produces a
program called @code{temacs}, also called a @dfn{bare impure Emacs}.
This version of Emacs contains the Emacs Lisp interpreter and I/O
routines, but not the editing commands.

@pindex loadup.el
  To create a working Emacs editor, you give the command @samp{temacs -l
loadup}.  This causes @code{temacs} to evaluate the Lisp files named in
the @file{loadup.el} file.  These create the normal Emacs editing
environment.

@pindex xemacs
  It takes a long time for @code{temacs} to evaluate the Lisp files.
However, you can create a version of Emacs that starts more quickly by
@dfn{dumping} a complete Emacs to an executable called @code{xemacs}.
This version starts more quickly because the Lisp code does not have to
be evaluated again.  This is what you usually do when you build
Emacs.  Renamed @code{emacs}, the @code{xemacs} executable is what most
people use for Emacs.  To produce it, give the command @samp{temacs -l
loadup dump}.

  The @code{xemacs} executable will automatically load a user's
@file{.emacs} file, or the default site initialization file.
This means that you can change the environment produced by a dumped
Emacs without rebuilding Emacs; or you can produce a version of Emacs
that suits you and is not the same as all the other instances of Emacs.

  On some systems, dumping does not work.  Then, you can start Emacs
with the @samp{temacs -l loadup} command.  This takes a long time, but
since you need to start Emacs once a day at most---and once a week or
less frequently if you never log out---the extra time is not much of a
problem.

@defun dump-emacs to-file from-file
@cindex unexec
  This function dumps the current state of Emacs into an executable file
@var{to-file}.  It takes symbols from @var{from-file} (this is normally
the executable file @file{temacs}).  See @code{Snarf-documentation} in
@ref{Documentation Strings}.
@end defun

@defun emacs-version
  This function returns a string describing the version of Emacs that is
running.  It is useful to include this string in bug reports.

@example
(emacs-version)
     @result{} "GNU Emacs 18.36.1 of Fri Feb 27 1987 on slug (berkeley-unix)"
@end example
@end defun

@defvar emacs-build-time
  The value of this global variable is the time at which Emacs was
built at the local site.

@example
emacs-build-time
     @result{} "Fri Feb 27 14:55:57 1987"
@end example
@end defvar

@defvar emacs-version
  The value of this variable is the version of Emacs being run.  It is a
string, e.g. @code{"18.36.1"}.
@end defvar

@node Pure Storage, Garbage Collection, Building Emacs, GNU Emacs Internals
@section Pure Storage

  There are two types of storage in GNU Emacs Lisp for user-created Lisp
objects: @dfn{normal storage} and @dfn{pure storage}.  Normal storage is where
all the new data which is created by any session with Emacs is kept.  When
a program conses up a list or the user defines a new function (or
loads a library), then that is placed in normal storage.

@cindex memory allocation
  If normal storage runs low, then Emacs requests the operating system to
allocate more memory in blocks of 1k bytes.  Each block is allocated for
one type only, meaning that symbols, conses, vectors, etc. are
segregated in distinct blocks in memory.

  After a certain amount of storage has been allocated (determined by
the variable @code{gc-cons-threshold}), the garbage collector is called
to collect all storage which has been used and abandoned.

  Pure storage is unique in that it is not expandable, it will not be
collected by the garbage collector, and it is also sharable---meaning
that if two people are running Emacs, the operating system may choose to
let them share the exact same memory for all the sharable portions of
Emacs.

  In essence, pure storage is used when initially building Emacs from the
sources, loading all of the normal files which everyone will want.  Normal
storage is used during sessions with Emacs.  

  There is a finite amount of space allocated to pure storage, and this
amount is based upon the number of functions there are that are normally
loaded when building Emacs.  If you load more functions into pure
storage, you should increase this value.  The amount of pure storage
allocated to Emacs is set in the @file{emacs/src/config.h} file.
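
  The fixed-size, non-expandable nature of pure storage can be pictured
with a toy allocator in C.  This is only a sketch under stated
assumptions, not the code Emacs uses: the names @code{toy_pure},
@code{toy_pure_copy}, @code{toy_pure_bytes_used} and the size
@code{TOY_PURESIZE} are all hypothetical, and returning a null pointer
when the region is full is a simplification chosen for the example.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size; the real amount is fixed when Emacs is built.  */
#define TOY_PURESIZE 64

char toy_pure[TOY_PURESIZE];     /* fixed region, never grown */
size_t toy_pure_bytes_used;      /* plays the role of pure-bytes-used */

/* Copy LEN bytes of DATA into the pure region and return the copy.
   Return NULL if the region is too full; there is deliberately no
   fallback to the normal allocator, since pure storage cannot expand.  */
const char *
toy_pure_copy (const char *data, size_t len)
{
  char *dest;

  assert (data != NULL);
  if (len > TOY_PURESIZE - toy_pure_bytes_used)
    return NULL;
  dest = toy_pure + toy_pure_bytes_used;
  memcpy (dest, data, len);
  toy_pure_bytes_used += len;
  return dest;
}
```
@end verbatim

  Nothing in this sketch ever frees or moves a copy, which mirrors the
statement above that pure storage is never garbage collected.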

@defun purecopy object
  This function makes a copy of @var{object} in pure storage and returns
it.  It copies strings by simply making a new string with the same
characters in pure storage.  It recursively copies the contents of vectors
and cons cells.

  It does not make copies of symbols, or any other objects, but just
returns them unchanged.  It signals an error if asked to copy markers.

This function is used only while Emacs is being built and dumped, and
appears only in the @file{emacs/lisp/loaddefs.el} file, among the Lisp
sources.
@end defun

@defvar pure-bytes-used
  The value of this variable is the number of bytes of pure storage
allocated so far.  Typically, in a dumped Emacs, this number is very
close to the total amount of pure storage that exists.
@end defvar

@defvar purify-flag
  This variable determines whether @code{defun} should make a copy of the
function definition in pure storage.  If it is non-@code{nil}, then the
function definition is copied into pure storage.

  This flag is @code{t} while loading all of the basic functions for
building Emacs initially (allowing those functions to be sharable and
non-collectible).  It is set to @code{nil} when Emacs is saved out
as @code{xemacs}.  The flag is set and reset in the C sources.

 You should not change this flag in a running Emacs.
@end defvar

@node Garbage Collection, Writing Emacs Primitives, Pure Storage, GNU Emacs Internals
@section Garbage Collection

  All functions which build new structures (be they lists, strings,
buffers, etc.) require storage space for those structures.  It is quite
common to use some storage for a while, and then release it, for
example, by killing a buffer or by deleting the last pointer to an
object.  Emacs provides a @dfn{garbage collector} to reclaim this
abandoned storage.  (`Garbage recycler' might be a more intuitive
metaphor for this function.)

  The garbage collector operates by scanning all the objects which are
accessible to the user and marking those that are in use.  This scan
includes all the symbols, their values and associated function
definitions, and any data presently on the stack.  Any objects which are
accessible through objects that are in use are also marked as being in
use.  Everything else is not in use and is therefore garbage.

  Unused cons cells are simply strung together onto a @dfn{free list} for
future allocation.  Strings are first ``compacted'', and then all unused
string space is made available to the string creation functions.
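
  The mark phase and the free list described above can be sketched as
a toy collector for cons cells.  This is an illustrative model, not the
Emacs implementation: the heap is a small array, cells refer to each
other by index, and every name (@code{toy_heap}, @code{toy_mark},
@code{toy_collect}, @code{toy_free_list}) is hypothetical.

@verbatim
```c
#include <assert.h>

#define TOY_NCELLS 8
#define TOY_NIL (-1)

struct toy_cons
{
  int car;     /* index of another cell, or TOY_NIL */
  int cdr;     /* index of another cell, or TOY_NIL;
                  doubles as the free-list link once swept */
  int marked;  /* set during the mark phase */
};

struct toy_cons toy_heap[TOY_NCELLS];
int toy_free_list = TOY_NIL;

/* Mark cell I and every cell reachable from it.  */
void
toy_mark (int i)
{
  while (i != TOY_NIL && !toy_heap[i].marked)
    {
      toy_heap[i].marked = 1;
      toy_mark (toy_heap[i].car);   /* recurse down the car */
      i = toy_heap[i].cdr;          /* loop along the cdr */
    }
}

/* Mark everything reachable from ROOTS, then string all unmarked
   cells together onto the free list.  Return the number freed.  */
int
toy_collect (const int *roots, int nroots)
{
  int i, freed = 0;

  assert (nroots >= 0);
  for (i = 0; i < TOY_NCELLS; i++)
    toy_heap[i].marked = 0;
  for (i = 0; i < nroots; i++)
    toy_mark (roots[i]);

  toy_free_list = TOY_NIL;
  for (i = 0; i < TOY_NCELLS; i++)
    if (!toy_heap[i].marked)
      {
        toy_heap[i].car = TOY_NIL;
        toy_heap[i].cdr = toy_free_list;
        toy_free_list = i;
        freed++;
      }
  return freed;
}
```
@end verbatim

  In this model the roots stand in for the symbols and stack data that
the real scan starts from; string compaction is not modeled at all.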

@quotation
@cindex Common Lisp garbage collection
@b{Common Lisp Note:} Unlike other Lisps, GNU Emacs Lisp does not call
the garbage collector when storage is exhausted.  Instead, it simply
requests the operating system to allocate more storage, and processing
continues until @code{gc-cons-threshold} bytes have been used.

This means that you can run the garbage collector and then be sure that
you won't need another garbage collection until another
@code{gc-cons-threshold} bytes have been used.
@end quotation

@deffn Command garbage-collect
  This function reclaims storage used by Lisp objects that are no longer
needed.  It returns information on the amount of space in use.  Garbage
collection happens automatically if you use more than
@code{gc-cons-threshold} bytes of Lisp data since the previous garbage
collection.  You can also request it explicitly by calling this
function.

  @code{garbage-collect} returns a list containing the following
information:

@example

((@var{used-conses} . @var{free-conses})
 (@var{used-syms} . @var{free-syms})
 (@var{used-markers} . @var{free-markers})
 @var{used-string-chars} 
 @var{used-vector-slots})

(garbage-collect)
     @result{} ((3435 . 2332) (1688 . 0) (57 . 417) 24510 3839)
@end example

Here is a table explaining each element:

@table @var
@item used-conses
The number of cons cells in use.

@item free-conses
The number of cons cells for which space has been obtained from the
operating system, but that are not currently being used.

@item used-syms
The number of symbols in use.

@item free-syms
The number of symbols for which space has been obtained from the
operating system, but that are not currently being used.

@item used-markers
The number of markers in use.

@item free-markers
The number of markers for which space has been obtained from the
operating system, but that are not currently being used.

@item used-string-chars
The total size of all strings, in characters.

@item used-vector-slots
The total number of elements of existing vectors.
@end table
@end deffn

@defopt gc-cons-threshold
  The value of this variable is the number of bytes of storage that may be
used after one garbage collection before another one is automatically
called.  Storage is ``used'' by @dfn{consing} (a cons cell is eight
bytes), creating strings (one byte per character plus a few bytes of
overhead), adding text to a buffer, etc.

  The initial value is 100,000.
  This variable may be set to a value above 100,000 to reduce the
frequency of garbage collections.  You can make collections more
frequent by setting a smaller value, down to 10,000.  Setting it to a
value less than 10,000 will only have effect until the subsequent
garbage collection, at which time @code{garbage-collect} will set it
back to 10,000.
@end defopt
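
  The threshold rule described for @code{gc-cons-threshold} can be
sketched in C as a byte counter that triggers a collection.  This is a
toy model under stated assumptions: the names
@code{toy_gc_cons_threshold}, @code{toy_note_consing} and
@code{toy_garbage_collect} are hypothetical, the collector body is
omitted, and the sketch does not try to show where in the interpreter
the real check takes place.

@verbatim
```c
#include <assert.h>

#define TOY_MIN_THRESHOLD 10000L    /* floor described in the text */

long toy_gc_cons_threshold = 100000L;  /* initial value from the text */
long toy_consing_since_gc;             /* bytes used since last collection */
int toy_gc_count;                      /* collections run so far */

/* Stand-in for a collection: a real collector would mark and sweep
   here.  Afterwards, reset the counter and raise a too-small
   threshold back to the floor.  */
void
toy_garbage_collect (void)
{
  toy_gc_count++;
  toy_consing_since_gc = 0;
  if (toy_gc_cons_threshold < TOY_MIN_THRESHOLD)
    toy_gc_cons_threshold = TOY_MIN_THRESHOLD;
}

/* Record that NBYTES of storage were used, and collect once more than
   the threshold has been used since the previous collection.  */
void
toy_note_consing (long nbytes)
{
  assert (nbytes >= 0);
  toy_consing_since_gc += nbytes;
  if (toy_consing_since_gc > toy_gc_cons_threshold)
    toy_garbage_collect ();
}
```
@end verbatim

  Setting @code{toy_gc_cons_threshold} below the floor in this model
affects only the next collection, after which the floor is restored,
matching the behavior described for values under 10,000.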

@node Writing Emacs Primitives, Object Internals, Garbage Collection, GNU Emacs Internals
@section Writing Emacs Primitives

@cindex primitive function internals

Certain functions, and all special forms, are written in C.  A
convenient interface is provided via a set of macros.  The only way to
really understand how to write new C code is to read the source;
however, some information will be provided here.

An example of a special form (an ordinary function would have the same
general appearance) is the definition of @code{or}, from @file{eval.c}.

@cindex garbage collection protection
@example
/* NOTE!!! Every function that can call EVAL must protect its args
 and temporaries from garbage collection while it needs them.
 The definition of `For' shows what you have to do.  */

DEFUN ("or", For, Sor, 0, UNEVALLED, 0,
  "Eval args until one of them yields non-NIL, then return that value.\n\
The remaining args are not evalled at all.\n\
If all args return NIL, return NIL.")
  (args)
     Lisp_Object args;
@{
  register Lisp_Object val;
  Lisp_Object args_left;
  struct gcpro gcpro1;

  if (NULL(args))
    return Qnil;

  args_left = args;
  GCPRO1 (args_left);

  do
    @{
      val = Feval (Fcar (args_left));
      if (!NULL (val))
        break;
      args_left = Fcdr (args_left);
    @}
  while (!NULL(args_left));

  UNGCPRO;
  return val;
@}
@end example

Here is a precise explanation of the arguments to the @code{DEFUN} macro:

@enumerate

@item
The first argument is the name of the function in Lisp; it will be
named @samp{or}.

@item
The second argument is the C function name for this function.  This is
the name that is used in C code for calling the function.  The name is,
by convention, @samp{F} prepended to the Lisp name, with all dashes
(@samp{-}) in the Lisp name changed to underscores.  Thus, if your C
code wishes to call this function, it will call @samp{For}.  Remember
that the arguments must be of type @code{Lisp_Object}; various macros
and functions for creating @code{Lisp_Object} are provided in the file
@file{lisp.h}.

@item
The third argument is the name of the C variable representing the Lisp
primitive that this function codes.  This name is by convention @samp{S}
prepended to the name, in the same manner that the function name is
created.

@item
The fourth argument is the minimum number of arguments that must be
provided; i.e., the number of required arguments.  In this case, no
arguments are required.

@item
The fifth argument is the maximum number of arguments that can be
provided.  Alternatively, it can be @code{UNEVALLED}, indicating a special
form that receives unevaluated arguments.  A function with the
equivalent of an @code{&rest} argument would have @code{MANY} in this
position.  Both @code{UNEVALLED} and @code{MANY} are macros.
This argument must be one of these macros or a
number at least as large as the fourth argument.

@item
The sixth argument is an interactive specification exactly like the one
provided in Lisp.  In this case it is 0 (a null pointer), indicating
that this function cannot be called interactively.  A value of @code{""}
indicates an interactive function not taking arguments.

@item
The last argument is the documentation string.  It is written just like
a documentation string for a function defined in Lisp, except you must
write @samp{\n\} at the end of each line.  In particular, the first line
should be a single sentence.
@end enumerate

  Also, you must provide a list of arguments, and declare their types
(always @code{Lisp_Object}).

  If you are modifying a file that already has Lisp primitives defined
in it, find the function near the end of the file named
@code{syms_of_@var{something}}, and add a line of the form

@example
defsubr (&Sname);
@end example

@noindent
  If the file doesn't have this function, or you have created a new file,
add a @code{syms_of_@var{filename}} (e.g., @code{syms_of_eval}), and
find the spot in @file{emacs.c} where all of these functions are called.
Add a call to your symbol initialization function there.  This makes all
the subroutines (primitives) available from Lisp.
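
  The relationship between a defining macro, the @samp{F} function, the
@samp{S} descriptor, and the registration call can be illustrated with
a much simplified stand-alone model.  This is not the real
@code{DEFUN} or @code{defsubr}: @code{TOY_DEFUN}, @code{struct
toy_subr}, @code{toy_defsubr}, @code{toy_syms_of_arith} and
@code{toy_lookup} are hypothetical names, the sketch handles only
one-argument functions, and a @code{long} stands in for
@code{Lisp_Object}.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long toy_object;        /* stand-in for Lisp_Object */

/* Descriptor tying a Lisp-visible name to a C function.  */
struct toy_subr
{
  toy_object (*function) (toy_object);
  int min_args, max_args;
  const char *lisp_name;
  const char *doc;
};

/* Hypothetical analogue of DEFUN: declare the C function, define its
   descriptor, then begin the function definition itself.  */
#define TOY_DEFUN(lname, fnname, sname, minargs, maxargs, doc)      \
  toy_object fnname (toy_object);                                   \
  struct toy_subr sname = { fnname, minargs, maxargs, lname, doc }; \
  toy_object fnname

TOY_DEFUN ("toy-1+", Ftoy_add1, Stoy_add1, 1, 1,
  "Return ARG plus one.")
  (toy_object arg)
{
  return arg + 1;
}

#define TOY_MAX_SUBRS 16
struct toy_subr *toy_subr_table[TOY_MAX_SUBRS];
int toy_subr_count;

/* Analogue of defsubr: make one primitive findable by its Lisp name.  */
void
toy_defsubr (struct toy_subr *subr)
{
  assert (toy_subr_count < TOY_MAX_SUBRS);
  toy_subr_table[toy_subr_count++] = subr;
}

/* Analogue of a syms_of_... function: register this file's primitives.  */
void
toy_syms_of_arith (void)
{
  toy_defsubr (&Stoy_add1);
}

/* Find a registered primitive by Lisp name, or return NULL.  */
struct toy_subr *
toy_lookup (const char *lisp_name)
{
  int i;

  for (i = 0; i < toy_subr_count; i++)
    if (strcmp (toy_subr_table[i]->lisp_name, lisp_name) == 0)
      return toy_subr_table[i];
  return NULL;
}
```
@end verbatim

  In this model a primitive is invisible to lookup until its
registration function has been called, which is the reason the text
asks you to add a call to your symbol initialization function.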

  Here is another function, with more complicated arguments.  This comes
from the code for the X window system, and it demonstrates the use of
macros and functions to manipulate Lisp objects.

@example
DEFUN ("coordinates-in-window-p", Fcoordinates_in_window_p,
  Scoordinates_in_window_p, 2, 2,
  "xSpecify coordinate pair: \nXExpression which evals to window: ",
  "Return non-nil if POSITIONS (a list, (SCREEN-X SCREEN-Y)) is in WINDOW.\n\  
  Returned value is list of positions expressed\n\
  relative to window upper left corner.")
  (coordinate, window)
     register Lisp_Object coordinate, window;
@{
  register Lisp_Object xcoord, ycoord;

  if (!LISTP  (coordinate)) wrong_type_argument (Qlistp, coordinate);
  CHECK_WINDOW (window, 2);
  xcoord = Fcar (coordinate);
  ycoord = Fcar (Fcdr (coordinate));
  CHECK_NUMBER (xcoord, 0);
  CHECK_NUMBER (ycoord, 1);
  if ((XINT (xcoord) < XINT (XWINDOW (window)->left)) ||
      (XINT (xcoord) >= (XINT (XWINDOW (window)->left) +
                         XINT (XWINDOW (window)->width))))
    @{
      return Qnil;
    @}
  XFASTINT (xcoord) -= XFASTINT (XWINDOW (window)->left);
  if (XINT (ycoord) == (screen_height - 1))
    return Qnil;
  if ((XINT (ycoord) < XINT (XWINDOW (window)->top)) ||
      (XINT (ycoord) >= (XINT (XWINDOW (window)->top) +
                         XINT (XWINDOW (window)->height)) - 1))
    @{
      return Qnil;
    @}
  XFASTINT (ycoord) -= XFASTINT (XWINDOW (window)->top);
  return (Fcons (xcoord, Fcons (ycoord, Qnil)));
@}
@end example

  Similar facilities exist at the C level as equivalents of
@code{defconst} and @code{defvar}, as well as a few others that have no
equivalent in the Lisp interpreter.

  Note that you cannot directly call functions defined in Lisp as, for
example, the primitive function @code{Fcons} is called above.  You must
create the appropriate Lisp form, protect everything from garbage
collection, and @code{Feval} the form, as was done in @code{For} above.

  @file{eval.c} is a very good file to look through for examples;
@file{lisp.h} contains the definitions for some important macros and
functions.

@node Object Internals,  , Writing Emacs Primitives, GNU Emacs Internals
@section Object Internals
@cindex object internals

  GNU Emacs Lisp manipulates many different types of data.  The actual
data is stored in a heap and the only access that programs have to the
data is through pointers.  Pointers are thirty-two bits wide in most
implementations.  Depending on the operating system and type of machine
for which you compile Emacs, twenty-four to twenty-six bits are used to
indicate the object, and the remaining six to eight bits are used for a
tag that identifies the object's type.

  Because all access to data is through tagged pointers, it is always
possible to determine the type of any object.  This allows variables to
be untyped, and the values assigned to them to be changed without regard
to type.  Function arguments also can be of any type; if you want a
function to accept only a certain type of argument, you must check the
type explicitly using a suitable predicate (@pxref{Type Predicates}).
@cindex type checking internals
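
  The tagged-pointer scheme can be sketched in C as follows.  This is
a minimal illustration, not the definitions from @file{lisp.h}: it
assumes a thirty-two bit word with eight tag bits above twenty-four
value bits (one of the layouts mentioned above), and the tag numbers and
the names @code{toy_make}, @code{toy_type_of}, @code{toy_value_of} and
@code{toy_consp} are hypothetical.

@verbatim
```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 8 tag bits above 24 value bits in a 32-bit word.  */
#define TOY_VALBITS 24
#define TOY_VALMASK (((uint32_t) 1 << TOY_VALBITS) - 1)

/* Arbitrary tag numbers chosen for this sketch.  */
enum toy_type { TOY_INT = 0, TOY_SYMBOL = 1, TOY_CONS = 5 };

typedef uint32_t toy_object;

/* Combine a type tag and a value into one tagged word.  */
toy_object
toy_make (enum toy_type type, uint32_t value)
{
  assert (value <= TOY_VALMASK);
  return ((uint32_t) type << TOY_VALBITS) | value;
}

/* Extract the type tag from the high bits.  */
enum toy_type
toy_type_of (toy_object obj)
{
  return (enum toy_type) (obj >> TOY_VALBITS);
}

/* Extract the value (address or integer) from the low bits.  */
uint32_t
toy_value_of (toy_object obj)
{
  return obj & TOY_VALMASK;
}

/* A type predicate only needs to look at the tag.  */
int
toy_consp (toy_object obj)
{
  return toy_type_of (obj) == TOY_CONS;
}
```
@end verbatim

  Because the tag travels with every word, a predicate such as
@code{toy_consp} can answer without consulting the object itself, which
is what makes untyped variables practical.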

@menu
* Buffer Internals::    
* Window Internals::    
* Process Internals::   
@end menu

@c !!! perhaps reformat the list of type tags with more explanation
@cindex object type tags
  Emacs has more than twenty types of tags; here is the list, from
@file{emacs/src/lisp.h}: integer, symbol, marker, string, vector of Lisp
objects, cons, byte-compiled function, editor buffer, built-in function,
internal value returned by subroutines of read, forwarding pointer to an
int variable, boolean forwarding pointer to an int variable, object
describing a connection to a subprocess, forwarding pointer to a
Lisp_Object variable, pointer to a vector-like object describing a
display screen, Lisp_Internal_Stream, Lisp_Buffer_Local_Value,
Lisp_Some_Buffer_Local_Value, Lisp_Buffer_Objfwd, Lisp_Void, Window used
for Emacs display, Lisp_Window_Configuration.

@node Buffer Internals, Window Internals, Object Internals, Object Internals
@subsection Buffer Internals

@cindex buffer internals

  Buffers have a set of fields which are not directly accessible by the
Lisp programmer.  However, there are often functions which can read and
change their values, even though the field names themselves are not
usable from Lisp in any way.

  The fields are (as of Emacs 18):

@table @code
@item name
The buffer name is a string which names the buffer.  It is guaranteed to
be unique.  @xref{Buffer Names}.

@item save_modified
This field contains the time when the buffer was last saved.
@xref{Buffer Modification}.

@item modtime
This field contains the modification time of the visited file.  It is
set when the file is written or read.  Every time the buffer is written
to the file, this field is compared to the modification time of the
file.  @xref{Buffer Modification}.

@item auto_save_modified
This field contains the time when the buffer was last auto-saved.

@item last_window_start
This field contains the position in the buffer at which the display
started the last time the buffer was displayed in a window.

@item undodata
This field contains records which tell Emacs how it can undo
the last set of changes to the buffer.  @xref{Undo}.

@item syntax_table_v
This field contains the syntax table for the buffer.  @xref{Syntax Tables}.

@item markers
This field contains the chain of all markers that point into the
buffer.  At each deletion or motion of the buffer gap, all of these
markers must be checked and perhaps updated.  @xref{Markers}.

@item backed_up
This field is a flag which tells if the visited file has been
backed up.

@item mark
This field contains the mark for the buffer.  The mark is a marker,
hence it is also included on the list @code{markers}.

@item local_var_alist
This field contains the association list containing all of the
buffer-local variables and their associated values.  A copy of this list
is returned by the function @code{buffer-local-variables}.

@item mode_line_format
@xref{Mode Line Format}.
@end table
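
  The remark under @code{markers} that every marker must be checked
after a deletion can be illustrated with a short C sketch.  It is a
simplified model rather than the Emacs code: the structure
@code{toy_marker}, its fields, and the function
@code{toy_adjust_markers_for_delete} are hypothetical, and the buffer
gap itself is not modeled.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* One marker in a buffer's chain of markers.  */
struct toy_marker
{
  long charpos;               /* position the marker points at */
  struct toy_marker *chain;   /* next marker in the same buffer */
};

/* Update every marker in the chain after the text between positions
   FROM and TO has been deleted.  Markers beyond the deleted text move
   back by its length; markers inside it collapse to FROM; markers
   before it are unchanged.  */
void
toy_adjust_markers_for_delete (struct toy_marker *markers,
                               long from, long to)
{
  struct toy_marker *m;

  assert (from <= to);
  for (m = markers; m != NULL; m = m->chain)
    {
      if (m->charpos > to)
        m->charpos -= to - from;
      else if (m->charpos > from)
        m->charpos = from;
    }
}
```
@end verbatim

  The loop visits the whole chain on each deletion, so its cost grows
with the number of markers that point into the buffer.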

@node Window Internals, Process Internals, Buffer Internals, Object Internals
@subsection Window Internals

@cindex window internals

Windows have the following accessible fields:

@table @code

@item height
  The height of the window, measured in lines.  All windows (save the
minibuffer window) have a minimum height of two lines, one of which is the
mode line.

@item width
  The width of the window, measured in columns.  There is no limit on the
width of a window, although a width of less than 2 does not allow for any
characters at all to be displayed (as one column is devoted to the vertical
dividing line).

@item buffer
  The buffer which the window is displaying.  This may change often during
the life of the window.

@item start
  The position in the buffer which is the first character to be displayed in
the window.  This is always located in the upper left corner (location
@code{(0,0)}).  There is no restriction on which character in a buffer this
is, although it is common for it to be the first character on a line.

@item pointm
@cindex window point internals
  This is the point of the current buffer when this window is selected;
when it is not selected, it retains its previous value.  When reselected, this
once again becomes the point for the current buffer (assuming that the buffer
associated with the window has not changed).

  This allows Emacs to maintain different points in different windows, even
when they display the same buffer.  (This variable is only useful when there
are multiple windows displaying the same buffer.)

@item left
  This is the left-hand edge of the window, measured in columns.  (The
left-most column on the screen is column 0.)

@item top
  This is the top edge of the window, measured in lines.  (The top line on
the screen is line 0.)

@item next
  This is the window that is the next in the chain of siblings.

@item prev
  This is the window that is the previous in the chain of siblings.

@item force_start
  This says that the next redisplay may not scroll the text heuristically.

@item hscroll
  This is the number of columns that the display in the window is scrolled
horizontally to the left.  Normally this is 0.

@item use_time
  This is the last time that the window was selected.  This field is used
by @code{get-lru-window}.

@end table

@node Process Internals,  , Window Internals, Object Internals
@subsection Process Internals

@cindex process internals

The fields of a process are:

@table @code
@item name
A string: the name used when creating the process, or a variant of it.
The name is normally created from the name of the program which it is
running.

@item command
A list: the command arguments that this process was created with.

@item filter
A function (or a symbol which names it): used to accept output from the
process instead of a buffer.

@item sentinel
A function: to be called whenever the process receives a signal.

@item buffer
A buffer: where standard output is directed if there is no filter.

@item pid
An integer: the Unix process ID.

@item command_channel_p
 A flag: Non-@code{nil} if this is really a command channel instead of a
process.  (This is not really used.)

@item childp
A flag: Non-@code{nil} if this is really a child process.

@item flags
A symbol: representing the state of the process, @code{run}, @code{stop},
@code{closed}, etc.

@item reason
A number: the Unix signal that the process received that caused the
process to stop.  (The process is not necessarily dead.)  If the process
has died, and was not killed by a signal, then this is the code the
process exited with.

@item mark
A marker: set to end of last output from this process inserted into the
buffer.  Normally this will be the end of the buffer.

@item kill_without_query
A flag: Non-@code{nil} means kill this process silently if Emacs is exited.
@end table