⟦7ecff831d⟧ TextFile

    Length: 40056 (0x9c78)
    Types: TextFile
    Names: »searching.texinfo«

Derivation

└─⟦a05ed705a⟧ Bits:30007078 DKUUG GNU 2/12/89
    └─⟦c06c473ab⟧ »./UNRELEASED/lispref.tar.Z« 
        └─⟦1b57a2ffe⟧ 
            └─⟦this⟧ »searching.texinfo« 

TextFile

@setfilename ../info/searching
@node Searching and Matching, Syntax Tables, Text, Top
@chapter Searching and Matching

@cindex searching
@cindex matching

  GNU Emacs provides two ways to search through a buffer for specified
text: exact string searches and regular expression searches.
@dfn{Matching} may be used following a regular expression search to
find the positions of subexpressions matched by that search.  This
chapter also describes replacement functions.

@menu
* Searching for Strings::	
* Regular Expressions::	
* Regular Expression Searching::	
* Replacement::	
* Match Data::	
* Standard Regexps::	
* Searching and Case::	
@end menu

@node Searching for Strings, Regular Expressions, Searching and Matching, Searching and Matching
@section Searching for Strings
@cindex string search

@deffn Command search-forward string &optional limit noerror repeat
  This function searches forward from point for an exact match for
@var{string}.  It sets point to the end of the occurrence found, and
returns @code{t}.

  In this example, point is positioned at the beginning of the line.
Then @code{(search-forward "fox")} is evaluated in the minibuffer and
point is left after the last letter of @samp{fox}:

@example
---------- Buffer: foo ----------
@point{}The quick brown fox jumped over the lazy dog.
---------- Buffer: foo ----------

(search-forward "fox")
     @result{} t

---------- Buffer: foo ----------
The quick brown fox@point{} jumped over the lazy dog.
---------- Buffer: foo ----------
@end example

  If @var{limit} is non-@code{nil} (it must be a position in the current
buffer), then it is the upper bound to the search.  No match extending
after that position is accepted.

@cindex search-failed error
  If the search fails and @var{noerror} is @code{nil}, then a
@code{search-failed} error is signaled.  If @var{noerror} is @code{t},
then a failing search just returns @code{nil} and doesn't signal an
error.  If @var{noerror} is neither @code{nil} nor @code{t}, then a
failing @code{search-forward} moves point to @var{limit} and returns
@code{nil}.

  If @var{repeat} is non-@code{nil}, then the search is repeated that many
times, the point being positioned at the end of the last match.
@end deffn
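
  The @var{limit} and @var{noerror} arguments are often used together in
a loop.  The following is a minimal sketch rather than a part of Emacs:
the function name @code{my-count-occurrences} is hypothetical.  It
counts the occurrences of a string between point and the end of the
buffer, and relies only on @code{search-forward} returning
non-@code{nil} on success and @code{nil} on failure when @var{noerror}
is @code{t}.

@example
(defun my-count-occurrences (string)
  "Return the number of occurrences of STRING after point.
STRING is assumed to be non-empty; an empty string matches
without moving point, so the loop would never end."
  (save-excursion                       ; Restore point afterwards.
    (let ((count 0))
      ;; With NOERROR t, a failed search returns nil, which ends
      ;; the loop instead of signaling `search-failed'.
      (while (search-forward string nil t)
        (setq count (1+ count)))
      count)))
@end example

@noindent
Passing a buffer position instead of @code{nil} as the second argument
of @code{search-forward} would restrict the count to the text before
that position.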

@deffn Command search-backward string &optional limit noerror repeat
  This function searches backward from point for @var{string}.  It is the
exact analog of @code{search-forward}.  It leaves the point at the
beginning of the string matched.
@end deffn

@deffn Command word-search-forward string &optional limit noerror repeat
  This function searches forward from the point for a ``word'' match for
@var{string}.  It sets the point to the end of the occurrence found, and
returns @code{t}.  

  A word search differs from a simple string search in that a word
search @strong{requires} that the words it searches for are separate
words (searching for the word @samp{ball} will not match the word
@samp{balls}), and punctuation and spacing are ignored (searching for
@samp{ball boy} will match @samp{ball.  Boy!}).

  In the example, the point is first placed at the beginning of the
buffer; the search leaves it between the @samp{y} and the @samp{!}.

@example
---------- Buffer: foo ----------
@point{}He said ``Please!  Find
the ball boy!''
---------- Buffer: foo ----------

(word-search-forward "Please find the ball, boy.")
     @result{} t

---------- Buffer: foo ----------
He said ``Please!  Find
the ball boy@point{}!''
---------- Buffer: foo ----------
@end example

  If @var{limit} is non-@code{nil} (it must be a position in the current
buffer), then it is the upper bound to the search.  The match found must
not extend after that position.

  If @var{noerror} is @code{t}, then @code{word-search-forward} returns
@code{nil} when a search fails, instead of signaling an error.  If
@var{noerror} is neither @code{nil} nor @code{t}, then
@code{word-search-forward} moves point to @var{limit} and returns
@code{nil}.

  If @var{repeat} is non-@code{nil}, then the search is repeated that many
times, the point being positioned at the end of the last match.

  When @code{word-search-forward} is called interactively, Emacs prompts
you for the search string; @var{limit} and @var{noerror} are set to
@code{nil}, and @var{repeat} is set to 1.
@end deffn

@deffn Command word-search-backward string &optional limit noerror repeat
  This function searches backward from the point for a word match to
@var{string}.  This function is the exact analog to
@code{word-search-forward}.
@end deffn

@node Regular Expressions, Regular Expression Searching, Searching for Strings, Searching and Matching
@section Regular Expression Syntax

@cindex patterns
@cindex regular expression syntax

  A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that
denotes a set of strings, possibly an infinite set.  Searching for
matches for a regexp is a very powerful operation.  In GNU Emacs, you
can search for the next match for a regexp either incrementally or not.
Incremental search commands are described in @cite{The GNU Emacs
Manual}.  @xref{Regexp Search, , Regular Expression Search, emacs,
The GNU Emacs Manual}.

  Also, @pxref{Major Modes, Major Modes}.

@menu
* Complex Regexp Example::      Illustrates regular expression syntax.
@end menu

@cindex invalid-regexp error
  Regular expressions have a syntax in which a few characters are special
constructs and the rest are @dfn{ordinary}.  An ordinary character is a
simple regular expression which matches that character and nothing else.
The special characters are @samp{$}, @samp{^}, @samp{.}, @samp{*},
@samp{+}, @samp{?}, @samp{[}, @samp{]} and @samp{\}; no new special
characters will be defined in the future.  Any other character appearing
in a regular expression is ordinary, unless a @samp{\} precedes it.  If
a regular expression is malformed, an @code{invalid-regexp} error is
signaled.  @refill

For example, @samp{f} is not a special character, so it is ordinary, and
therefore @samp{f} is a regular expression that matches the string
@samp{f} and no other string.  (It does @emph{not} match the string
@samp{ff}.)  Likewise, @samp{o} is a regular expression that matches
only @samp{o}.@refill

Any two regular expressions @var{a} and @var{b} can be concatenated.  The
result is a regular expression which matches a string if @var{a} matches
some amount of the beginning of that string and @var{b} matches the rest of
the string.@refill

As a simple example, we can concatenate the regular expressions @samp{f}
and @samp{o} to get the regular expression @samp{fo}, which matches only
the string @samp{fo}.  Still trivial.  To do something nontrivial, you
need to use one of the special characters.  Here is a list of them:

@table @kbd
@item .@: @r{(Period)}
@cindex @samp{.} in regexp
is a special character that matches any single character except a newline.
Using concatenation, we can make regular expressions like @samp{a.b} which
matches any three-character string which begins with @samp{a} and ends with
@samp{b}.@refill

@item *
@cindex @samp{*} in regexp
is not a construct by itself; it is a suffix, which means the
preceding regular expression is to be repeated as many times as
possible.  In @samp{fo*}, the @samp{*} applies to the @samp{o}, so
@samp{fo*} matches one @samp{f} followed by any number of @samp{o}s.
The case of zero @samp{o}s is allowed: @samp{fo*} does match
@samp{f}.@refill

@samp{*} always applies to the @emph{smallest} possible preceding
expression.  Thus, @samp{fo*} has a repeating @samp{o}, not a
repeating @samp{fo}.@refill

The matcher processes a @samp{*} construct by matching, immediately,
as many repetitions as can be found.  Then it continues with the rest
of the pattern.  If that fails, backtracking occurs, discarding some
of the matches of the @samp{*}-modified construct in case that makes
it possible to match the rest of the pattern.  For example, matching
@samp{ca*ar} against the string @samp{caaar}, the @samp{a*} first
tries to match all three @samp{a}s; but the rest of the pattern is
@samp{ar} and there is only @samp{r} left to match, so this try fails.
The next alternative is for @samp{a*} to match only two @samp{a}s.
With this choice, the rest of the regexp matches successfully.@refill

@item +
@cindex @samp{+} in regexp
is a suffix character similar to @samp{*} except that it requires that
the preceding expression be matched at least once.  So, for example,
@samp{ca+r} will match the strings @samp{car} and @samp{caaaar}
but not the string @samp{cr}, whereas @samp{ca*r} would match all
three strings.@refill

@item ?
@cindex @samp{?} in regexp
is a suffix character similar to @samp{*} except that it can match the
preceding expression either once or not at all.  For example,
@samp{ca?r} will match @samp{car} or @samp{cr}; nothing else.

@item [ @dots{} ]
@cindex @samp{[} in regexp
@cindex @samp{]} in regexp
@samp{[} begins a @dfn{character set}, which is terminated by a
@samp{]}.  In the simplest case, the characters between the two form
the set.  Thus, @samp{[ad]} matches either one @samp{a} or one
@samp{d}, and @samp{[ad]*} matches any string composed of just
@samp{a}s and @samp{d}s (including the empty string), from which it
follows that @samp{c[ad]*r} matches @samp{cr}, @samp{car}, @samp{cdr},
@samp{caddaar}, etc.@refill

Character ranges can also be included in a character set, by writing
two characters with a @samp{-} between them.  Thus, @samp{[a-z]}
matches any lower-case letter.  Ranges may be intermixed freely with
individual characters, as in @samp{[a-z$%.]}, which matches any lower
case letter or @samp{$}, @samp{%} or period.@refill

Note that the usual special characters are not special any more inside
a character set.  A completely different set of special characters
exists inside character sets: @samp{]}, @samp{-} and @samp{^}.@refill

To include a @samp{]} in a character set, you must make it the first
character.  For example, @samp{[]a]} matches @samp{]} or @samp{a}.  To
include a @samp{-}, write @samp{---}, which is a range containing only
@samp{-}.  To include @samp{^}, make it other than the first character
in the set.@refill

@item [^ @dots{} ]
@cindex @samp{^} in regexp
@samp{[^} begins a @dfn{complement character set}, which matches any
character except the ones specified.  Thus, @samp{[^a-z0-9A-Z]}
matches all characters @emph{except} letters and digits.@refill

@samp{^} is not special in a character set unless it is the first
character.  The character following the @samp{^} is treated as if it
were first (@samp{-} and @samp{]} are not special there).

Note that a complement character set can match a newline, unless
newline is mentioned as one of the characters not to match.

@item ^
@cindex @samp{^} in regexp
@cindex beginning of line
is a special character that matches the empty string, but only if at
the beginning of a line in the text being matched.  Otherwise it fails
to match anything.  Thus, @samp{^foo} matches a @samp{foo} which occurs
at the beginning of a line.

When matching a string, @samp{^} matches at the beginning of the string
or after a newline character @samp{\n}. 

@item $
@cindex @samp{$} in regexp
is similar to @samp{^} but matches only at the end of a line.  Thus,
@samp{xx*$} matches a string of one @samp{x} or more at the end of a line.

When matching a string, @samp{$} matches at the end of the string
or before a newline character @samp{\n}.

@item \
@cindex @samp{\} in regexp
has two functions: it quotes the special characters (including
@samp{\}), and it introduces additional special constructs.

Because @samp{\} quotes special characters, @samp{\$} is a regular
expression which matches only @samp{$}, and @samp{\[} is a regular
expression which matches only @samp{[}, and so on.

Note that @samp{\} also has special meaning inside the read syntax of
Lisp strings (@pxref{String Type}).  Therefore, although the regular
expression that matches a single @samp{\} character is @samp{\\}, you
must precede each of its two backslashes with another @samp{\} when
writing it as a Lisp string, i.e., @code{"\\\\"}.
@refill
@end table
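
The special characters described above can be tried out with
@code{string-match} (@pxref{Regular Expression Searching}), which
returns the index of the first match in a string, or @code{nil} if there
is none.  The calls below are only an illustrative sketch; the test
strings are chosen for this example.  Within a string, @samp{^} and
@samp{$} anchor the match to the beginning and end of the string.

@example
(string-match "^ca*r$" "cr")              ; @result{} 0
(string-match "^ca+r$" "cr")              ; @result{} nil
(string-match "^ca?r$" "caar")            ; @result{} nil
(string-match "^c[ad]*r$" "caddaar")      ; @result{} 0
(string-match "[^a-z0-9A-Z]" "abc def")   ; @result{} 3
;; The regexp \\ matches one backslash; in Lisp syntax each
;; backslash is doubled again.  The string searched is a\b.
(string-match "\\\\" "a\\b")              ; @result{} 1
@end example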

Note: for historical compatibility, special characters are treated as
ordinary ones if they are in contexts where their special meanings make no
sense.  For example, @samp{*foo} treats @samp{*} as ordinary since there is
no preceding expression on which the @samp{*} can act.  It is poor practice
to depend on this behavior; better to quote the special character anyway,
regardless of where it appears.@refill

For the most part, @samp{\} followed by any character matches only
that character.  However, there are several exceptions: characters
which, when preceded by @samp{\}, are special constructs.  Such
characters are always ordinary when encountered on their own.  Here
is a table of @samp{\} constructs.

@table @kbd
@item \|
@cindex @samp{|} in regexp
@cindex regexp alternative
specifies an alternative.
Two regular expressions @var{a} and @var{b} with @samp{\|} in
between form an expression that matches anything that either @var{a} or
@var{b} will match.@refill

Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar}
but no other string.@refill

@samp{\|} applies to the largest possible surrounding expressions.  Only a
surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of
@samp{\|}.@refill

Full backtracking capability exists to handle multiple uses of @samp{\|}.

@item \( @dots{} \)
@cindex @samp{(} in regexp
@cindex @samp{)} in regexp
@cindex regexp grouping
is a grouping construct that serves three purposes:

@enumerate
@item
To enclose a set of @samp{\|} alternatives for other operations.
Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}.

@item
To enclose a complicated expression for the postfix @samp{*} to operate on.
Thus, @samp{ba\(na\)*} matches @samp{bananana}, etc., with any (zero or
more) number of @samp{na} strings.@refill

@item
To mark a matched substring for future reference.

@end enumerate

This last application is not a consequence of the idea of a
parenthetical grouping; it is a separate feature which happens to be
assigned as a second meaning to the same @samp{\( @dots{} \)} construct
because there is no conflict in practice between the two meanings.
Here is an explanation of this feature:

@item \@var{digit}
after the end of a @samp{\( @dots{} \)} construct, the matcher remembers the
beginning and end of the text matched by that construct.  Then, later on
in the regular expression, you can use @samp{\} followed by @var{digit}
to mean ``match the same text matched the @var{digit}'th time by the
@samp{\( @dots{} \)} construct.''@refill

The strings matching the first nine @samp{\( @dots{} \)} constructs appearing
in a regular expression are assigned numbers 1 through 9 in the order that the
open-parentheses appear in the regular expression.  @samp{\1} through
@samp{\9} may be used to refer to the text matched by the corresponding
@samp{\( @dots{} \)} construct.

For example, @samp{\(.*\)\1} matches any newline-free string that is
composed of two identical halves.  The @samp{\(.*\)} matches the first
half, which may be anything, but the @samp{\1} that follows must match
the same exact text.

@item \`
@cindex @samp{`} in regexp
matches the empty string, provided it is at the beginning
of the buffer.

@item \'
@cindex @samp{'} in regexp
matches the empty string, provided it is at the end of
the buffer.

@item \b
@cindex @samp{\b} in regexp
matches the empty string, provided it is at the beginning or
end of a word.  Thus, @samp{\bfoo\b} matches any occurrence of
@samp{foo} as a separate word.  @samp{\bballs?\b} matches
@samp{ball} or @samp{balls} as a separate word.@refill

@item \B
@cindex @samp{\B} in regexp
matches the empty string, provided it is @emph{not} at the beginning or
end of a word.

@item \<
@cindex @samp{\<} in regexp
matches the empty string, provided it is at the beginning of a word.

@item \>
@cindex @samp{\>} in regexp
matches the empty string, provided it is at the end of a word.

@item \w
@cindex @samp{\w} in regexp
matches any word-constituent character.  The editor syntax table
determines which characters these are.

@item \W
@cindex @samp{\W} in regexp
matches any character that is not a word-constituent.

@item \s@var{code}
@cindex @samp{\s} in regexp
matches any character whose syntax is @var{code}.  @var{code} is a
character which represents a syntax code: thus, @samp{w} for word
constituent, @samp{-} for whitespace, @samp{(} for open-parenthesis,
etc.  @xref{Syntax Tables}.@refill

@item \S@var{code}
@cindex @samp{\S} in regexp
matches any character whose syntax is not @var{code}.
@end table
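
The @samp{\} constructs in the table above can be explored in the same
way with @code{string-match}.  The following calls are a sketch under
two assumptions: the test strings are invented for this example, and the
current syntax table treats letters as word constituents and the space
as a non-word character, as the standard syntax table does.

@example
;; Grouping limits the reach of \| to the alternatives inside.
(string-match "\\(foo\\|bar\\)x" "a barx")     ; @result{} 2
;; Without grouping, \| separates "foo" from "barx".
(string-match "foo\\|barx" "a foo")            ; @result{} 2
;; \1 must match the same text as the first group.
(string-match "^\\(.*\\)\\1$" "abab")          ; @result{} 0
(string-match "^\\(.*\\)\\1$" "aba")           ; @result{} nil
;; \b matches only at the beginning or end of a word.
(string-match "\\bballs?\\b" "the balls")      ; @result{} 4
(string-match "\\bball\\b" "balls")            ; @result{} nil
;; \W matches the first character that is not a word constituent.
(string-match "\\W" "ab cd")                   ; @result{} 2
@end example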

@node Complex Regexp Example,  , Regular Expressions, Regular Expressions
@comment  node-name,  next,  previous,  up
@subsection Complex Regexp Example

  Here is a complicated regexp, used by Emacs to recognize the end of a
sentence together with any whitespace that follows.  It is the value of
the variable @code{sentence-end}.  

First, the regexp is given in Lisp syntax to enable you to distinguish
the spaces from the tab characters.  In Lisp syntax, the string constant
begins and ends with a double-quote.  @samp{\"} stands for a
double-quote as part of the regexp, @samp{\\} for a backslash as part of
the regexp, @samp{\t} for a tab and @samp{\n} for a newline.

@example
"[.?!][]\"')@}]*\\($\\|\t\\|  \\)[\n]*"
@end example

In contrast, if you evaluate the variable @code{sentence-end}, you will
see the following:

@example
sentence-end
@result{}
"[.?!][]\"')@}]*\\($\\|  \\|  \\)[       
]*"
@end example

@noindent
In this case, the tab and newline are the actual characters.

This regular expression contains four parts in succession and can be
deciphered as follows:

@table @code
@item [.?!]
The first part of the pattern consists of three characters, a period, a
question mark and an exclamation mark, within square brackets.  The
match must begin with one or other of these characters.

@item []\"')@}]*
The second part of the pattern is the group of closing braces and
quotation marks, which can appear zero or more times.  These may follow
the period, question mark or exclamation mark.  The @code{\"} is Lisp
syntax for a double quote in a string.  The asterisk, @samp{*},
indicates that the items in the previous group (the group surrounded by
square brackets, @samp{[]}) may be repeated zero or more times.

@item \\($\\|\t\\|  \\)
The third part of the pattern is one or other of: either the end of a
line, or a tab, or two blank spaces.  The backslashes make the
parentheses and vertical bars special constructs rather than ordinary
characters; each backslash is doubled because this is Lisp string
syntax.  The parentheses mark the group and the vertical bars
indicate that the patterns to either side of them are alternatives.
The dollar sign is used to indicate the end of a line.  The @key{TAB}
character is inserted using @kbd{\t} and the two spaces are inserted
as is.

@item [\n]*
Finally, the last part of the pattern indicates that the end of the line
or the whitespace following the period, question mark or exclamation
mark may, but need not, be followed by one or more newlines.
@end table

@defun regexp-quote string
  This function returns a regular expression string which matches exactly
@var{string} and nothing else.  This allows you to request an exact
string match when calling a function that wants a regular expression.

@example
(regexp-quote "^The cat$")
     @result{} "\\^The cat\\$"
@end example

One use of @code{regexp-quote} is to combine an exact string match with
context described as a regular expression.  For example, this searches
for the string which is the value of @code{string}, surrounded by
whitespace:

@example
(re-search-forward (concat "\\s " (regexp-quote string) "\\s "))
@end example
@end defun
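
The effect of @code{regexp-quote} is easiest to see by comparing a
quoted and an unquoted pattern.  In this small sketch (the test string
is invented for the example), the unquoted period matches any character,
while the quoted one matches only a literal period:

@example
(string-match "a.b" "axb a.b")                  ; @result{} 0
(string-match (regexp-quote "a.b") "axb a.b")   ; @result{} 4
@end example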

@node Regular Expression Searching, Replacement, Regular Expressions, Searching and Matching
@section Regular Expression Searching

@cindex regular expression searching

  A cluster of functions provide various features involved with regular
expression searches.  The primary function is @code{re-search-forward}.

@deffn Command re-search-forward regexp &optional limit noerror repeat
  This function searches forward in the current buffer for a string of
text that is matched by the regular expression @var{regexp}.  The
function skips over any amount of text that is not matched by
@var{regexp}, and leaves the point at the end of the first string of
text that does match.

  If the search is successful (i.e., if there is text that is matched by
@var{regexp}), then point is left at the end of that text, and the
function returns @code{t}.  

  If there is no text matched by @var{regexp}, then a
@code{search-failed} error is signaled.  However, if @var{noerror} is
@code{t}, then if the search fails, @code{re-search-forward} returns
@code{nil} without signaling an error.

  If @var{limit} is supplied (it must be a number or a marker), it will
be the maximum position in the buffer that the point can be skipped to.
Point will be left at or before @var{limit}.  Also, the match found must
not extend after that position.  This means that nothing can be found
beyond the @var{limit}.

  Also, if the search fails and @var{noerror} is neither @code{nil} nor
@code{t}, then point is moved to and left at the @var{limit} position;
and the function returns @code{nil}.

If @var{repeat} is supplied (it must be a positive number), then the
search is repeated that many times, and point is left at the end of the
last match found.

  When called interactively, Emacs prompts you for @var{regexp} in the
minibuffer.

  In the example, point is located directly before the @samp{T}.  After
evaluating the form, it is located at the end of that line (between the
@samp{t} of @samp{hat} and the newline).

@example
---------- Buffer: foo ----------
I read "@point{}The cat in the hat
comes back" twice.
---------- Buffer: foo ----------

(re-search-forward "[a-z]+" nil t 5)
     @result{} t

---------- Buffer: foo ----------
I read "The cat in the hat@point{}
comes back" twice.
---------- Buffer: foo ----------
@end example
@end deffn
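
Because each successful search leaves point at the end of the match,
@code{re-search-forward} is commonly called in a loop.  The following is
a minimal sketch, assuming a hypothetical function name
@code{my-collect-words} that is not part of Emacs.  It uses
@code{match-beginning} and @code{match-end} (@pxref{Match Data}) to
extract the text of each match.

@example
(defun my-collect-words ()
  "Return a list of the words between point and the end of the buffer."
  (save-excursion                       ; Restore point afterwards.
    (let ((words nil))
      ;; Each successful search leaves point after the match, so
      ;; the next iteration continues from there.  With NOERROR t,
      ;; the final failing search returns nil and ends the loop.
      (while (re-search-forward "\\w+" nil t)
        (setq words
              (cons (buffer-substring (match-beginning 0)
                                      (match-end 0))
                    words)))
      (nreverse words))))
@end example

@noindent
With point before the @samp{T} in a buffer containing @samp{The cat in
the hat}, this sketch would return a list of the five words in buffer
order.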

@deffn Command re-search-backward regexp &optional limit noerror repeat
  This function searches backward in the current buffer for a string of
text that is matched by the regular expression @var{regexp}, leaving
point at the beginning of the first text found.  This function is the
exact analog of @code{re-search-forward}.
@end deffn

@defun string-match regexp string &optional start
  This function returns the index of the start of the first match for
the regular expression @var{regexp} in @var{string}, or @code{nil} if
there is no match.  If @var{start} is non-@code{nil}, the search is
started at that index in @var{string}.

  For example,

@example
(string-match "quick" "The quick brown fox jumped quickly.")
     @result{} 4
(string-match "quick" "The quick brown fox jumped quickly." 8)
     @result{} 27
@end example

@noindent
The index of the first character of the
string is 0, the index of the second character is 1, and so on.

  After this function returns, the index of the first character beyond
the match is available as @code{(match-end 0)}.

@example
(string-match "quick" "The quick brown fox jumped quickly." 8)
     @result{} 27

(match-end 0)
     @result{} 32
@end example

  The @code{match-end} function is described along with
@code{match-beginning}; @pxref{Match Data}.
@end defun

@defun looking-at regexp
  This function determines whether the text in the current buffer
directly following the point matches the regular expression
@var{regexp}.  ``Directly following'' means precisely that: the search
is ``anchored'' and it must succeed starting with the first character
following the point.  The result is @code{t} if so, @code{nil}
otherwise.

  Point is not moved, but the match data is updated and can be used with
@code{match-beginning} or @code{match-end}.

  In the example, the point is located directly before the @samp{T}.  If it
were anywhere else, the result would have been @code{nil}.

@example
---------- Buffer: foo ----------
I read "@point{}The cat in the hat
comes back" twice.
---------- Buffer: foo ----------

(looking-at "The cat in the hat$")
     @result{} t
@end example
@end defun
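
Since @code{looking-at} does not move point, it is convenient for
testing the text at a known position.  As a sketch only (the function
name @code{my-comment-line-p} and the choice of @samp{;} as the comment
character are assumptions made for this example), the following tests
whether the current line begins with optional whitespace followed by a
semicolon:

@example
(defun my-comment-line-p ()
  "Return non-nil if the current line starts with whitespace and `;'."
  (save-excursion
    ;; The match is anchored at point, so move to the start of
    ;; the line first; `save-excursion' restores point afterwards.
    (beginning-of-line)
    (looking-at "[ \t]*;")))
@end example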

@ignore
@deffn Command delete-matching-lines regexp
  This function is identical to @code{delete-non-matching-lines}, save
that it deletes what @code{delete-non-matching-lines} keeps.

In the example below, the point is located on the first line of text.

@example
---------- Buffer: foo ----------
We hold these truths
to be self-evident,
that all men are created
equal, and that they are
---------- Buffer: foo ----------

(delete-matching-lines "the")
     @result{} nil

---------- Buffer: foo ----------
to be self-evident,
that all men are created
---------- Buffer: foo ----------
@end example
@end deffn

@deffn Command flush-lines regexp
  This function is the same as @code{delete-matching-lines}.
@end deffn

@defun delete-non-matching-lines regexp
  This function deletes all lines following the point which don't
contain a match for the regular expression @var{regexp}.
@end defun

@deffn Command keep-lines regexp
  This function is the same as @code{delete-non-matching-lines}.
@end deffn

@deffn Command how-many regexp
  This function counts the number of matches for @var{regexp} there are in
the current buffer following the point.  It prints this number in
the echo area, returning the string printed.
@end deffn

@deffn Command count-matches regexp
  This function is a synonym of @code{how-many}.
@end deffn

@deffn Command list-matching-lines regexp nlines
This function is a synonym of @code{occur}.
Show all lines following point containing a match for @var{regexp}.
Display each line with @var{nlines} lines before and after,
or @code{-}@var{nlines} before if @var{nlines} is negative.
@var{nlines} defaults to @code{list-matching-lines-default-context-lines}.
Interactively it is the prefix arg.

The lines are shown in a buffer named @samp{*Occur*}.
It serves as a menu to find any of the occurrences in this buffer.
@kbd{C-h m} (@code{describe-mode}) in that buffer gives help.
@end deffn

@defopt list-matching-lines-default-context-lines
Default value is 0.
Default number of context lines to include around a @code{list-matching-lines}
match.  A negative number means to include that many lines before the match.
A positive number means to include that many lines both before and after.
@end defopt
@end ignore

@node Replacement, Match Data, Regular Expression Searching, Searching and Matching
@section Replacement
@cindex replacement

  Emacs has several replacement commands for interactive use.  For a
description of these, @pxref{Replace, , Replacement Commands, emacs, The
GNU Emacs Manual}.

  The commands include:

@table @code
@item  replace-regexp
This function replaces every match of @var{regexp} occurring between
point and the maximum point by @var{replacement}, which must be a
string.

@item  replace-string
This function replaces occurrences of @var{string} with @var{replacement}.
@end table

The following function replaces characters, not strings, but is included
here since it involves replacement.

@defun subst-char-in-region start end old-char new-char &optional noundo
@cindex replace characters
  This function replaces all occurrences of the character @var{old-char}
with the character @var{new-char} in the region of the current buffer
defined by @var{start} and @var{end}.

@cindex Outline mode
@cindex undo avoidance
  If @var{noundo} is non-@code{nil}, then @code{subst-char-in-region}
does not record the change for undo and does not mark the buffer as
modified.  This optional argument is used for obscure purposes, for
example, in Outline mode to change visible lines to invisible lines and
vice versa.

  @code{subst-char-in-region} does not move point and returns
@code{nil}.

@example
---------- Buffer: foo ----------
This is the contents of the buffer before.
---------- Buffer: foo ----------

(subst-char-in-region 1 20 ?i ?X)
     @result{} nil

---------- Buffer: foo ----------
ThXs Xs the contents of the buffer before.
---------- Buffer: foo ----------
@end example
@end defun

@ignore

@deffn Command replace-regexp regexp replacement delimited

The action of this function is to replace every match of @var{regexp}
occurring between point and the maximum point by @var{replacement}, which
must be a string.  The special treatment of @samp{\} in @code{replacement}
is the same as for @code{replace-match}.

  If @var{delimited} is non-@code{nil}, then it replaces only matches
surrounded by word boundaries.

  The case of the replacement text will be determined by the same rules
that @code{replace-match} uses.
@end deffn

@deffn Command replace-string string replacement &optional delimited
  This function replaces occurrences of @var{string} with @var{replacement}.
@end deffn

@defvar query-replace-help
  The value of this variable is a help message to print when the user types
@kbd{?} in @code{query-replace}.
@end defvar

@end ignore

@node Match Data, Standard Regexps, Replacement, Searching and Matching
@section Match Data

  Emacs keeps track of the positions of the start and end of segments of
text found during a regular expression search.  This means, for example,
that you can search for a complex pattern, such as a date in an rmail
message, and extract different parts of it.  

@defun match-beginning count
  This function returns the position of the start of text matched by the
last regular expression searched for.  @var{count}, a number, specifies
which subexpression to return the start position of.  If @var{count} is
zero, then it returns the position of the start of the text matched by
the whole regexp.  If @var{count} is greater than zero, then the
position of the beginning of the text matched by the @var{count}'th
subexpression is returned, regardless of whether it was used in the
final match.

Subexpressions are those expressions grouped inside of parentheses,
@samp{\(@dots{}\)}.  The @var{count}'th subexpression is found by counting
occurrences of @samp{\(} from the beginning of the whole regular
expression.  The first subexpression is 1, the second is 2, and so on.

The @code{match-end} function is similar to the @code{match-beginning}
function except that it returns the position of the end of the matched
text.

  (In the example, the positions in the text are numbered to make the
results more apparent.)

@example
(string-match "\\(qu\\)\\(ick\\)" "The quick brown fox jumped quickly.")
     @result{} 4                         ;^^^^^^^^^^
                                  ;0123456789      

(match-beginning 1)                            ; @r{The beginning of the match}
     @result{} 4                        ; @r{with @samp{qu} is at index 4.}

(match-beginning 2)                            ; @r{The beginning of the match}
     @result{} 6                        ; @r{with @samp{ick} is at index 6.}

(match-end 1)                                  ; @r{The end of the match}
     @result{} 6                        ; @r{with @samp{qu} is at index 6.}

(match-end 2)                                  ; @r{The end of the match}
     @result{} 9                        ; @r{with @samp{ick} is at index 9.}
@end example

@noindent
The @code{match-end} function is used in functions such as
@code{rmail-make-basic-summary-line}.
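
  As a further illustration, here is a minimal sketch of extracting the
parts of a date from a string.  The function name @code{my-date-parts}
and the sample string are hypothetical; they are not taken from Rmail.

@example
(defun my-date-parts (string)
  "Return (DAY MONTH YEAR) as strings, or nil if STRING has no date."
  ;; The three parenthesized groups are subexpressions 1, 2 and 3.
  (if (string-match "\\([0-9]+\\) \\([A-Za-z]+\\) \\([0-9]+\\)" string)
      (list (substring string (match-beginning 1) (match-end 1))
            (substring string (match-beginning 2) (match-end 2))
            (substring string (match-beginning 3) (match-end 3)))))

(my-date-parts "Date: 12 Feb 1989")
     @result{} ("12" "Feb" "1989")
@end example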

  Here is another example.  Before the form is evaluated, point is
located at the beginning of the line.  After the search form is
evaluated, point is between the space following @samp{cat} and the word
@samp{in}.  The beginning of the entire match is at the 9th character of
the buffer (@samp{T}), and the beginning of the match for the first
subexpression is at the 13th character (@samp{c}).

@example
(list
  (re-search-forward "The \\(cat \\)")
  (match-beginning 0)
  (match-beginning 1))
@result{} (t 9 13)

---------- Buffer: foo ----------
I read "The cat @point{}in the hat comes back" twice.
        ^   ^
        9  13
---------- Buffer: foo ----------
@end example

@noindent
(Note that in this case, the index returned is a buffer position; the first
character of the buffer counts as 1.)

  It is essential that @code{match-beginning} be called after the
desired search and before any other search is performed.
@code{match-beginning} may not give the desired results if called in a
separate command from the search.  The example below shows the wrong way
to call @code{match-beginning}.

@example
(re-search-forward "The \\(cat \\)")
     @result{} t
(foo)                   ; @r{Perhaps @code{foo} does more regexp searching.}
(match-beginning 0)
     @result{} 61              ; @r{Unexpected result!}
@end example
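
  By contrast, one safe pattern is to copy the positions into local
variables immediately after the search, before any other code runs.
This is only a sketch: @code{my-search-then-call} is a hypothetical
name, and @var{function} stands for any code that may do searching of
its own.

@example
(defun my-search-then-call (regexp function)
  "Search forward for REGEXP, call FUNCTION, and return the match bounds.
Return nil, without calling FUNCTION, if the search fails."
  (if (re-search-forward regexp nil t)
      ;; Record the positions right away, while they are still valid.
      (let ((beg (match-beginning 0))
            (end (match-end 0)))
        (funcall function)      ; May search; beg and end are unaffected.
        (list beg end))))
@end example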

  See the discussion of @code{store-match-data} for an example of how to
save match data and restore the information after an intervening search.
@end defun

@defun match-end count
  This function returns the position of the end of text matched by
the last regular expression searched for.  This function is the exact
analog of @code{match-beginning}.
@end defun

@defun replace-match replacement &optional fixedcase literal
  This function replaces the text matched by the last search with
@var{replacement}.

@cindex case in replacements
  If @var{fixedcase} is non-@code{nil}, then the case of the replacement text
is not changed.  Otherwise the replacement text is converted to a
different case depending upon the capitalization of the text to be
replaced.  If the original text is all upper case, then the replacement
text is converted to upper case, except when all of the words in the
original text are only one character long.  In that event, the
replacement text is capitalized.  If @emph{all} of the words in
the original text are capitalized, then all of the words in the
replacement text will be capitalized.
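
  The following sketch shows the case conversion at work.  The helper
@code{my-replace-first} is hypothetical, and it assumes that
@var{replacement} contains no @samp{\} characters.

@example
(defun my-replace-first (regexp replacement)
  "Replace the first match for REGEXP in the buffer with REPLACEMENT."
  (let ((case-fold-search t))   ; Let "foo" also match "FOO" and "Foo".
    (goto-char (point-min))
    (if (re-search-forward regexp nil t)
        ;; FIXEDCASE is omitted, so the case of the match is imitated.
        (replace-match replacement))))
@end example

@noindent
With this definition, @code{(my-replace-first "foo" "bar")} replaces a
first match of @samp{FOO} with @samp{BAR}, a first match of @samp{Foo}
with @samp{Bar}, and a first match of @samp{foo} with @samp{bar}.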

  If @var{literal} is non-@code{nil}, then @var{replacement} is inserted
exactly as it is, the only alterations being a possible change in case.
If it is @code{nil} (the default), then the character @samp{\} is treated
specially.  If a @samp{\} appears in @var{replacement}, then it must be
part of one of the following sequences:

@table @asis
@item @samp{\&}
@cindex @samp{&} in replacement
@samp{\&} is replaced by the entire text that was matched.

@item @samp{\@var{n}}
@cindex @samp{\@var{n}} in replacement
@samp{\@var{n}}, where @var{n} is a digit, is replaced by the text
matched by the @var{n}'th subexpression of the original regexp.
Subexpressions are those expressions grouped inside of
@samp{\(@dots{}\)}.

@item @samp{\\}
@cindex @samp{\} in replacement
@samp{\\} is replaced by @samp{\}.
@end table

@code{replace-match} leaves point at the end of the replacement text,
and returns @code{t}.
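
  As a sketch of the @samp{\@var{n}} sequences, the hypothetical
function below swaps two words separated by a comma.  Note that each
@samp{\} must be doubled when written in a Lisp string.

@example
(defun my-swap-pair ()
  "Swap the next pair of comma-separated words after point.
Return t if a pair was found, nil otherwise."
  (if (re-search-forward "\\([a-z]+\\), \\([a-z]+\\)" nil t)
      (progn
        ;; \\2 and \\1 stand for the text of subexpressions 2 and 1.
        ;; FIXEDCASE is t, so the case of the text is left alone.
        (replace-match "\\2, \\1" t)
        t)))
@end example

@noindent
Evaluated with point at the beginning of a buffer containing
@samp{smith, john}, this leaves the buffer containing @samp{john, smith}.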
@end defun

@defun match-data
  This function returns a new list containing all the information on
what the last search matched.  The zero'th element is the beginning of
the match for the whole expression; the first element is the end of the
match for the expression.  The next two elements are the beginning and
end of the match for the first subexpression.  In general, the
2@var{n}'th element corresponds to @code{(match-beginning @var{n})}; and
element 2@var{n} + 1 corresponds to @code{(match-end @var{n})}.

All the elements are markers, or @code{nil} if there was no match for
that subexpression.  As with @code{match-beginning}, there must be no
possibility of intervening searches between the call to a search and
the call to @code{match-data} that is intended to save the match data for
that search.

@example
(match-data)
     @result{} (#<marker at 9 in foo> #<marker at 17 in foo>
         #<marker at 13 in foo> #<marker at 17 in foo>)
@end example
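
  Since markers are awkward to read, a list of plain positions can be
easier to inspect.  Here is a minimal sketch, in which
@code{my-match-positions} is a hypothetical name.  The result shown
assumes that it is called immediately after the search for
@code{"The \\(cat \\)"} in the buffer @samp{foo} used above.

@example
(defun my-match-positions ()
  "Return the elements of the match data as numbers (or nil)."
  (mapcar (function
           (lambda (m)
             ;; Convert markers; pass nil or numbers through unchanged.
             (if (markerp m) (marker-position m) m)))
          (match-data)))

(my-match-positions)
     @result{} (9 17 13 17)
@end example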
@end defun

@defun store-match-data match-list
  This function sets the internal data structure for the ``last search
match'' to the elements of @var{match-list}.  @var{match-list} should have
been created by calling @code{match-data} previously.

Together with @code{match-data}, @code{store-match-data} may be used to
preserve the existing match data across a regexp search.  This is
useful when such searches occur in subroutines whose callers may not
expect searches to go on.

The following example illustrates the canonical use of these two
functions.

@example
(let ((data (match-data)))
  (unwind-protect
      ... ; @r{May change the original match data.}
    (store-match-data data)))
@end example
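
  To make the pattern concrete, here is a sketch of a subroutine that
searches without disturbing its caller's match data.  The name
@code{my-buffer-contains-p} is hypothetical.

@example
(defun my-buffer-contains-p (regexp)
  "Return non-nil if REGEXP matches anywhere in the current buffer.
The caller's match data and point are left unchanged."
  (let ((data (match-data)))
    (unwind-protect
        (save-excursion
          (goto-char (point-min))
          (re-search-forward regexp nil t))
      ;; This runs even if the search signals an error.
      (store-match-data data))))
@end example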

  All asynchronous process functions (filters and sentinels) and some
modes that use @code{recursive-edit} should save and restore the
match data if they do a search or if they let a user make a
search.  Here is a function which will restore the match data if the
buffer associated with it still exists.

@example
(defun restore-match-data (data)
  "Restore the match data DATA unless the buffer is missing."
  (catch 'foo
    (let ((d data))
      (while d
        (and (car d)
             (null (marker-buffer (car d)))
             ;; match-data buffer is deleted.
             (throw 'foo nil))
        (setq d (cdr d)))
      (store-match-data data))))
@end example
@end defun

@node Standard Regexps, Searching and Case, Match Data, Searching and Matching
@section Standard Regular Expressions Used in Editing
@cindex regular expressions used standardly in editing
@cindex standard regular expressions used in editing

@defvar page-delimiter
This is the regexp describing line-beginnings that separate pages.  The
default value is @code{"^\014"} (i.e., @code{"^^L"} or @code{"^\C-l"}).
@end defvar
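
  As an illustration, the following sketch counts the page delimiters in
the current buffer.  @code{my-count-page-delimiters} is a hypothetical
name, and the loop assumes that @code{page-delimiter} never matches the
empty string (which is true of the default value).

@example
(defun my-count-page-delimiters ()
  "Return the number of matches for `page-delimiter' in the buffer."
  (save-excursion
    (goto-char (point-min))
    (let ((count 0))
      ;; Each successful search leaves point after the delimiter,
      ;; so the loop always makes progress.
      (while (re-search-forward page-delimiter nil t)
        (setq count (1+ count)))
      count)))
@end example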

@defvar paragraph-separate
This is the regular expression for the beginning of a line that separates
paragraphs.  (If you change this, you may have to change
@code{paragraph-start} also.)  The default value is @code{"^[ \t\f]*$"},
which is a line that consists entirely of spaces, tabs, and form feeds.
@end defvar

@defvar paragraph-start
This is the regular expression for the beginning of a line that starts
@emph{or} separates paragraphs.  The default value is
@code{"^[ \t\n\f]"}, which matches a line that begins with a space, tab,
newline, or form feed.
@end defvar
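
  Here is a sketch of how the two paragraph variables might be used
together to classify the current line.  The function
@code{my-line-kind} and the symbols it returns are hypothetical.

@example
(defun my-line-kind ()
  "Return `separator', `start' or `text' for the current line."
  (save-excursion
    (beginning-of-line)
    ;; Test paragraph-separate first: with the default values, a
    ;; separator line normally matches paragraph-start as well.
    (cond ((looking-at paragraph-separate) 'separator)
          ((looking-at paragraph-start) 'start)
          (t 'text))))
@end example

@noindent
Note that @code{looking-at} is itself a regexp match, so calling this
function changes the match data.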

@defvar sentence-end
This is the regular expression describing the end of a sentence.  All
paragraph boundaries also end sentences, regardless of the value of this
variable.  The default value is
@code{"[.?!][]\"')@}]*\\($\\|\t\\| \\)[ \t\n]*"}.  This means a period,
question mark, or exclamation mark; followed by any number of closing
brackets, quotes, parentheses, or braces; followed by the end of a line,
a tab, or a space; followed by any number of spaces, tabs, or newlines.

@xref{Complex Regexp Example}, for a full description of this regular
expression.
@end defvar
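
  As a small illustration, the sketch below applies the sentence-end
regexp quoted above to a string.  The function name is hypothetical, and
the regexp is copied into a local variable rather than taken from
@code{sentence-end}, so that the example does not depend on the current
value of that variable.

@example
(defun my-first-sentence-end (string)
  "Return the index in STRING just after the first sentence end.
Return nil if STRING contains no sentence end."
  (let ((regexp "[.?!][]\"')@}]*\\($\\|\t\\| \\)[ \t\n]*"))
    (if (string-match regexp string)
        (match-end 0))))

(my-first-sentence-end "Hello there.  How are you?")
     @result{} 14
@end example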

@node Searching and Case,  , Standard Regexps, Searching and Matching
@section Searching and Case

@cindex searching and case

  By default, searches in Emacs ignore the case of the text they
are searching through; if you specify searching for @samp{FOO}, then
@samp{Foo} and @samp{foo} are also considered a match.  Regexps, and in
particular character sets, are included: @samp{[aB]} would match @samp{a}
or @samp{A} or @samp{b} or @samp{B}.@refill

  If you do not want this feature, set the variable
@code{case-fold-search} to @code{nil}.  Then all letters must match
exactly, including case.  This is a buffer-local variable; altering
the variable affects only the current buffer.  (@xref{Buffer Local
Variables}.)  Alternatively, you may change the value of
@code{default-case-fold-search}, which is the default value of
@code{case-fold-search} for buffers that do not override it.
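
  Often it is more convenient to bind @code{case-fold-search}
temporarily with @code{let} than to set it.  Here is a minimal sketch,
in which @code{my-string-match-exact} is a hypothetical name:

@example
(defun my-string-match-exact (regexp string)
  "Like `string-match', but with case always significant."
  (let ((case-fold-search nil))
    (string-match regexp string)))

(my-string-match-exact "FOO" "foo Foo FOO")
     @result{} 8

(let ((case-fold-search t))
  (string-match "FOO" "foo Foo FOO"))
     @result{} 0
@end example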

@defopt case-replace
  This variable determines whether @code{query-replace} should
preserve case in replacements.  If the variable is @code{nil}, then the
replacement text is inserted exactly as given, with no case conversion.
@end defopt

@defopt case-fold-search
  This buffer-local variable determines whether searches should ignore
case.  If the variable is @code{nil}, they do not ignore case; if it is
non-@code{nil}, they do.
@end defopt

@defvar default-case-fold-search
  The value of this variable is the default value for
@code{case-fold-search} in buffers that do not override it.  This is the
same as @code{(default-value 'case-fold-search)}.
@end defvar