syntax.texi

@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
@c Copyright (C) 1990-1995, 1998-1999, 2001-2015 Free Software
@c Foundation, Inc.
@c See the file elisp.texi for copying conditions.
@node Syntax Tables
@chapter 構文テーブル
@cindex 構文解析
@cindex 構文テーブル
@cindex テキストの解析

@dfn{構文テーブル}（syntax table）は、
各文字の構文的なテキスト上の機能を指定する。
これは、
単語やシンボルなどの構文要素がどこで始まりどこで終るかを調べるために使う。
この情報は、フォントロックモード（@pxref{Font Lock
Mode}）や、多数の複雑な移動コマンド（@pxref{Motion}）を含む、
多くのEmacsの機能で使われる。

@menu
* Basics: Syntax Basics.     Basic concepts of syntax tables.
* Syntax Descriptors::       How characters are classified.
* Syntax Table Functions::   How to create, examine and alter syntax tables.
* Syntax Properties::        Overriding syntax with text properties.
* Motion and Syntax::        Moving over characters with certain syntaxes.
* Parsing Expressions::      Parsing balanced expressions
                                using the syntax table.
* Syntax Table Internals::   How syntax table information is stored.
* Categories::               Another way of classifying character syntax.
@end menu

@node Syntax Basics
@section 構文テーブルの概念

構文テーブルは、
@dfn{構文クラス}を参照するのに使われるデータ構造である。
構文テーブルはLispプログラムによって、テキストを解析し、移動するのに使われる。

内部的には、構文テーブルは文字テーブル（@pxref{Char-Tables}）である。
@var{c}で添字付けられる要素は、コードが@var{c}である文字について記述する。
要素の値は、当該文字の構文上の機能を符号化したconsセルである。
詳細は、@xref{Syntax Table Internals}。
しかし、構文テーブルを参照するのに通常は@code{aset}や@code{aref}は使わず、
より高位な関数@code{char-syntax}および@code{modify-syntax-entry}を利用する。
（これらは@ref{Syntax Table Functions}で述べられる。）

@defun syntax-table-p object
この関数は、@var{object}が構文テーブルならば@code{t}を返す。
@end defun

各バッファには独自のメジャーモードがあり、
各メジャーモードはさまざまな文字の構文クラスを独自に扱います。
たとえば、lispモードでは文字@samp{;}はコメントを始めますが、
Cモードでは文を終らせます。
このような多様性を扱うために、
Emacsは各バッファごとにローカルな構文テーブルを選びます。
典型的には、各メジャーモードに独自の構文テーブルがあり、
そのモードを使っているバッファに当該構文テーブルをインストールします。
For example, the variable @code{emacs-lisp-mode-syntax-table} holds
the syntax table used by Emacs Lisp mode, and
@code{c-mode-syntax-table} holds the syntax table used by C mode.
この構文テーブルを変更すると、
同じモードのバッファだけでなく将来そのモードになったバッファでも
構文を変更してしまいます。
類似したモードでは1つの構文テーブルを共有することがあります。
構文テーブルの設定方法の例については、@xref{Example Major Modes}。

@cindex standard syntax table
@cindex inheritance, syntax table
構文テーブルは、@dfn{親の構文テーブル}と呼ばれる別の構文テーブルを@dfn{継承}できる。
構文テーブルは幾つかの文字の構文クラスを指定せずに残すことができ、
それらの構文クラスを親から『継承』させることができる。
このような文字は、構文クラスを親の構文テーブルから得る（@pxref{Syntax Class
Table}）。
Emacs defines a @dfn{standard syntax table}, which is the
default parent syntax table, and is also the syntax table used by
Fundamental mode.

@defun standard-syntax-table
この関数は、基本（fundamental）モードで使用する構文テーブルである
標準の構文テーブルを返す。
@end defun

  Syntax tables are not used by the Emacs Lisp reader, which has its
own built-in syntactic rules which cannot be changed.  (Some Lisp
systems provide ways to redefine the read syntax, but we decided to
leave this feature out of Emacs Lisp for simplicity.)

@node Syntax Descriptors
@section 構文記述子
@cindex 構文クラス

@dfn{構文クラス}は構文的な役割を記述する文字である。
各構文テーブルは、各文字の構文クラスを指定する。
ある構文テーブルでの文字のクラスと
他の構文テーブルでの当該文字のクラスとのあいだにはなんの関係も必要ありません。

各構文クラスはニーモニック文字で区別します。
ニーモニック文字は、
クラスを指定する必要があるときにクラス名として働きます。
通常、指定子の文字は当該クラスによく現れるものです。
しかしながら、指定子としての意味は不変で、その文字の現在の構文とは独立です。
Thus,
@samp{\} as a designator character always means ``escape character''
syntax, regardless of whether the @samp{\} character actually has that
syntax in the current syntax table.
@ifnottex
@xref{Syntax Class Table}, for a list of syntax classes and their
designator characters.
@end ifnottex

@cindex 構文記述子
@dfn{構文記述子}は、構文クラス、（括弧のクラスの場合にのみ使われる）
釣り合う文字、フラグを指定するLisp文字列です。
When you want to
modify the syntax of a character, that is done by calling the function
@code{modify-syntax-entry} and passing a syntax descriptor as one of
its arguments (@pxref{Syntax Table Functions}).

最初の文字は、構文クラスを指定する文字（指定子）です。
2番目の文字はその文字にマッチする文字
（例えばLispでは@samp{(}にマッチする文字は@samp{)}）
ですが、使用しない場合には空白です。
そのあとに付加的な構文属性を指定するフラグ文字が続きます
（@pxref{Syntax Flags}）。

釣り合う文字やフラグが必要なければ、文字1つだけで十分です。

たとえば、Cモードにおける文字@samp{*}の構文記述子は
@code{". 23"}（句読点、釣り合う文字なし、
コメント開始の2番目の文字、コメント終了の最初の文字）であり、
@samp{/}は@samp{@w{. 14}}（句読点、釣り合う文字なし、
コメント開始の最初の文字、コメント終了の2番目の文字）です。

  Emacs also defines @dfn{raw syntax descriptors}, which are used to
describe syntax classes at a lower level.  @xref{Syntax Table
Internals}.

@menu
* Syntax Class Table::      Table of syntax classes.
* Syntax Flags::            Additional flags each character can have.
@end menu

@node Syntax Class Table
@subsection 構文クラス一覧
@cindex syntax class table

以下の一覧は、構文クラス、そのクラスを表す文字（指定子）、
そのクラスの意味、その使用例です。

@table @asis
@item 白文字 (Whitespace characters): @samp{@ } or @samp{-}
シンボルや単語を互いに区切る。
典型的には、白文字には他の構文上の意味はなく、
複数個の白文字は1つの白文字と構文的には等価である。
ほとんどすべてのメジャーモードでは、
空白、タブ、改行、ページ送りは白文字である。

This syntax class can be designated by either @w{@samp{@ }} or
@samp{-}.  Both designators are equivalent.

@item 単語構成文字 (Word constituents): @samp{w}
一般言語の単語の一部分である。
典型的には、プログラムの変数やコマンド名に使われる。
すべての大英文字、小英文字、数字文字は、典型的には単語構成文字である。

@item シンボル構成文字 (Symbol constituents): @samp{_}
単語構成文字に加えて、変数やコマンド名に使われる追加の文字である。
たとえば、英単語の一部分ではないがシンボル名に使える特定の文字を指定
するために、Lispモードではシンボル構成文字クラスを使う。
このような文字は@samp{$&*+-_<>}である。
標準のCでは、単語構成文字でなくてシンボルに使える唯一の文字は
下線（@samp{_}）である。

@item 句読点文字 (Punctuation characters): @samp{.}
一般言語の句読点として使われたり、
プログラム言語でシンボルを区切るために使われる文字である。
emacs-lispモードを含むほとんどのプログラム言語向けモードでは、
シンボル構成文字でも単語構成文字でもない少数の文字には別の用途があるため、
このクラスの文字はない。

@item 開き括弧文字 (Open parenthesis characters): @samp{(}
@itemx 閉じ括弧文字 (Close parenthesis characters): @samp{)}
開き／閉じ括弧文字は、文や式を囲む相異なる対として使われる文字である。
そのようなグループ化は、開き括弧文字で始まり閉じ括弧文字で終る。
各開き括弧文字は特定の閉じ括弧文字に対応し、その逆もいえる。
Emacsは、通常、閉じ括弧文字を挿入すると
対応する開き括弧文字を短時間指し示す。
@xref{Blinking}。

一般言語とC言語のコードでは、
括弧の対は、@samp{()}、@samp{[]}、@samp{@{@}}である。
Emacs Lispでは、リストとベクトルの区切り（@samp{()}と@samp{[]}）は、
括弧文字としてクラス分けされる。

@item 文字列クォート (String quotes): @samp{"}
LispやCを含む多くの言語で文字列定数を区切るために使われる。
同じ文字列クォート文字が文字列の最初と最後に現れる。

Emacsの構文解析機能では、文字列を1つの字句とみなす。
文字列内の文字の普通の構文的な意味は抑制される。

lisp向けのモードには文字列クォート文字が2つ、
ダブルクォート（@samp{"}）と縦棒（@samp{|}）がある。
@samp{|}はEmacs Lispでは使わないがCommon Lispで使う。
Cにも2つの文字列クォート文字、
文字列用のダブルクォートと文字定数用のシングルクォート（@samp{'}）がある。

一般言語のテキストはプログラム言語ではないので、文字列クォート文字はない。
英文でも引用符は用いるが、その内側の文字の普通の構文的な属性を
抑制したくないのである。

@item エスケープ構文文字 (Escape-syntax characters): @samp{\}
Cの文字列や文字定数で使われるようなエスケープシーケンスを開始する。
CとLispでは、文字@samp{\}はこのクラスに属する。
（Cではこの文字は文字列の内側だけで使われるが、
Cモードでつねにこのように扱っても問題ないことがわかった。）

@code{words-include-escapes}が@code{nil}以外であると、
このクラスの文字は単語の一部分と解釈される。
@xref{Word Motion}。

@item 文字クォート (Character quotes): @samp{/}
この文字は、
後続の1文字をクォートし、通常の構文上の意味を抑制する。
直後の1文字のみに影響するという点で、エスケープ文字と異なる。

@code{words-include-escapes}が@code{nil}以外であると、
このクラスの文字は単語の一部分と解釈される。
@xref{Word Motion}。

このクラスは、@TeX{}モードのバックスラッシュに使われる。

@item 対になった区切り (Paired delimiters): @samp{$}
文字列クォート文字と同様であるが、
区切り文字のあいだにある文字の構文上の属性を抑制しない点が異なる。
現在、対になった区切りは@TeX{}モードのみで使い、
数学モードに出入りする@samp{$}である。

@item 式前置子 (Expression prefixes): @samp{'}
式のまえに現れると
式の一部であるとみなされる構文上の演算子に使われる。
lisp向けのモードでは、（クォートする）アポストロフ@samp{'}、
（マクロで使う）コンマ@samp{,}、
（ある種のデータの入力構文に使われる）@samp{#}の文字がそうである。

@item コメント開始 (Comment starters): @samp{<}
@itemx コメント終了 (Comment enders): @samp{>}
@cindex コメントの構文
さまざまな言語でコメントを区切るために用いられる。
コメント文字は人間言語では存在しない。
Lispでは、セミコロン（@samp{;}）がコメントを開始し、改行またはフォームフィードがコメントを終了する。

@item 標準構文の継承 (Inherit standard syntax): @samp{@@}
この構文クラスは特定の構文を指定しない。
標準の構文テーブルで当該文字の構文を探す指定である。

@item 汎用コメント区切り (Generic comment delimiters): @samp{!}
特別な種類のコメントを
始めて終える文字である。
@emph{任意の}汎用コメント区切り文字は
@emph{任意の}汎用コメント区切り文字に対応するが、
普通のコメント開始文字／コメント終了文字には対応しない。
汎用コメント区切り文字同士のみで対応する。

この構文クラスは、主にテキスト属性@code{syntax-table}
（@pxref{Syntax Properties}）で使うことを意図したものである。
任意の範囲の文字がコメントを形成すると印を付けるには、
その範囲の先頭と末尾の文字にそれらが汎用コメント区切りであることを
識別する属性@code{syntax-table}を与える。

@item 汎用文字列区切り (Generic string delimiters): @samp{|}
この文字は、文字列を始めるか、終える。
このクラスは文字列クォートクラスと異なり、
@emph{任意の}汎用文字列区切りは他の汎用文字列区切りに対応し、
普通の文字列クォート文字には対応しない。

この構文クラスは、主にテキスト属性@code{syntax-table}
（@pxref{Syntax Properties}）で使うことを意図したものである。
任意の範囲の文字が文字列定数を形成すると印を付けるには、
その範囲の先頭と末尾の文字にそれらが汎用文字列区切りであることを
識別する属性@code{syntax-table}を与える。
@end table

@node Syntax Flags
@subsection 構文フラグ
@cindex 構文フラグ

構文テーブルの各文字には、構文クラスに加えて、フラグも指定できます。
文字@samp{1}、@samp{2}、@samp{3}、@samp{4}、@samp{b}、@samp{c}、@samp{n}、@samp{p}で
表現される８つの可能なフラグがあります。

@samp{p}を除くすべてのフラグは、コメント区切りの記述に使われる。
数字フラグは、２文字からなるコメント区切りに使われる。
すなわち、文字はクラスで表される構文上の属性に@emph{加えて}、
コメント列の一部分でもありえることを示す。
フラグはクラスや他のフラグとは独立であり、
Cモードの@samp{*}のような、句読点文字である@emph{と同時に}、
コメント開始列の2番目の文字@samp{/*}@emph{でも}あり、
コメント終了列の最初の文字@samp{*/}でもあるような
文字のためにあります。
@samp{b}、@samp{c}、@samp{n}は、コメント区切り文字の
性質を示すのに使われます。

文字@var{c}に対して可能なフラグとそれらの意味を以下に示します。

@itemize @bullet
@item
@samp{1}は、@var{c}が2文字のコメント開始列を始めることを意味する。

@item
@samp{2}は、@var{c}がそのような列の2番目の文字であることを意味する。

@item
@samp{3}は、@var{c}が2文字のコメント終了列を始めることを意味する。

@item
@samp{4}は、@var{c}がそのような列の2番目の文字であることを意味する。

@item
@samp{b}は、コメント区切りとしての@var{c}が
もう1つの『b』形式のコメントに属することを意味する。
２文字からなるコメント開始列では、このフラグは２文字目のみ意味がある。
２文字からなるコメント終了列では、このフラグは１文字目のみ意味がある。

@item
@samp{c} means that @var{c} as a comment delimiter belongs to the
alternative ``c'' comment style.  For a two-character comment
delimiter, @samp{c} on either character makes it of style ``c''.

@item
@samp{n} on a comment delimiter character specifies
that this kind of comment can be nested.  For a two-character
comment delimiter, @samp{n} on either character makes it
nestable.

@cindex comment style
Emacsでは、任意の1つの構文テーブルで2つの形式のコメントを同時に扱える。
A comment style is a set of flags @samp{b}, @samp{c}, and
@samp{n}, so there can be up to 8 different comment styles.
コメント構文の各形式には、独自の開始列と独自の終了列がある。
各コメントはどちらか1つの形式である必要がある。
したがって、『bn』形式のコメント開始列で始まるものは、
『bn』形式のコメント終了列で終る必要がある。

C++向けの適切なコメント構文の設定はつぎのとおりである。

@table @asis
@item @samp{/}
@samp{124}
@item @samp{*}
@samp{23b}
@item 改行
@samp{>}
@end table

これは4つのコメント区切り列を定義する。

@table @asis
@item @samp{/*}
2文字目の@samp{*}にはフラグ@samp{b}があるので、
これは『b』形式のコメント開始列である。

@item @samp{//}
2文字目の@samp{/}にはフラグ@samp{b}がないので、
これは『a』形式のコメント開始列である。

@item @samp{*/}
2文字目の@samp{*}にはフラグ@samp{b}があるので、
これは『b』形式のコメント終了列である。

@item 改行
改行にはフラグ@samp{b}がないので、
これは『a』形式のコメント終了列である。
@end table

@item
@samp{p}は、Lisp構文向けの追加の『前置文字』を示す。
これらの文字は式のあいだに現れるときには白文字として扱う。
式の内側に現れると、それらの通常の構文コードに従って扱われる。

関数@code{backward-prefix-chars}は後方へ向けて移動するときには、
構文クラスが式前置子（@samp{'}）である文字に加えて
これらの文字も飛び越す。
@xref{Motion and Syntax}。
@end itemize

@node Syntax Table Functions
@section 構文テーブル向け関数

本節では、構文テーブルを作成／参照／変更するための関数について述べます。

@defun make-syntax-table &optional table
この関数は、新たな構文テーブルを作成する。
@var{table}が@code{nil}でない場合、新しい構文テーブルの親は@var{table}である。
それ以外の場合は親の構文テーブルは標準構文テーブルである。

In the new syntax table, all characters are initially given the
``inherit'' (@samp{@@}) syntax class, i.e., their syntax is inherited
from the parent table (@pxref{Syntax Class Table}).
@end defun

@defun copy-syntax-table &optional table
この関数は、構文テーブル@var{table}のコピーを作成しそれを返す。
@var{table}を指定しないと（あるいは@code{nil}）、
現在の構文テーブルのコピーを返す。
@var{table}が構文テーブルでないとエラーを通知する。
@end defun

@deffn Command modify-syntax-entry char syntax-descriptor  &optional table
@cindex syntax entry, setting
この関数は、文字@var{char}の構文指定を
構文記述子@var{syntax-descriptor}とする。
@var{char} must be a character, or a cons
cell of the form @code{(@var{min} . @var{max})}; in the latter case,
the function sets the syntax entries for all characters in the range
between @var{min} and @var{max}, inclusive.

構文テーブル@var{table}においてのみ構文を変更し、
他の構文テーブルは変更しない。

The argument @var{syntax-descriptor} is a syntax descriptor, i.e., a
string whose first character is a syntax class designator and whose
second and subsequent characters optionally specify a matching
character and syntax flags.  @xref{Syntax Descriptors}.  An error is
signaled if @var{syntax-descriptor} is not a valid syntax descriptor.

この関数はつねに@code{nil}を返す。
当該構文テーブルにおけるこの文字に対する古い構文情報は破棄される。

@example
@group
@exdent @r{【例】}

;; @r{空白文字をクラス白文字にする}
(modify-syntax-entry ?\s " ")
     @result{} nil
@end group

@group
;; @r{@samp{$}を開き括弧文字にする}
;;   @r{対応する閉じる文字は@samp{^}である}
(modify-syntax-entry ?$ "(^")
     @result{} nil
@end group

@group
;; @r{@samp{^}を閉じ括弧文字にする}
;;   @r{対応する開く文字は@samp{$}である}
(modify-syntax-entry ?^ ")$")
     @result{} nil
@end group

@group
;; @r{@samp{/}を句読点文字にする}
;;   @r{コメント開始列の最初の文字、および、}
;;   @r{コメント終了列の2番目の文字にもする}
;;   @r{これはCモードで用いられる}
(modify-syntax-entry ?/ ". 14")
     @result{} nil
@end group
@end example
@end deffn

@defun char-syntax character
この関数は、文字@var{character}の構文クラスを指定子で表したもので返す
（@pxref{Syntax Class Table}）。
これは構文クラス@emph{のみ}を返し、釣り合う文字や構文フラグは返さない。

つぎの例はCモードにあてはまる。
(We use @code{string} to make
it easier to see the character returned by @code{char-syntax}.)

@example
@group
;; Space characters have whitespace syntax class.
(string (char-syntax ?\s))
     @result{} " "
@end group

@group
;; Forward slash characters have punctuation syntax.
;; Note that this @code{char-syntax} call does not reveal
;; that it is also part of comment-start and -end sequences.
(string (char-syntax ?/))
     @result{} "."
@end group

@group
;; Open parenthesis characters have open parenthesis syntax.
;; Note that this @code{char-syntax} call does not reveal that
;; it has a matching character, @samp{)}.
(string (char-syntax ?\())
     @result{} "("
@end group
@end example

@end defun

@defun set-syntax-table table
この関数は、@var{table}をカレントバッファの構文テーブルにする。
@var{table}を返す。
@end defun

@defun syntax-table
この関数は、現在の構文テーブル、つまり、
カレントバッファの構文テーブルを返す。
@end defun

@deffn Command describe-syntax &optional buffer
This command displays the contents of the syntax table of
@var{buffer} (by default, the current buffer) in a help buffer.
@end deffn

@defmac with-syntax-table table body@dots{}
This macro executes @var{body} using @var{table} as the current syntax
table.  It returns the value of the last form in @var{body}, after
restoring the old current syntax table.

Since each buffer has its own current syntax table, we should make that
more precise: @code{with-syntax-table} temporarily alters the current
syntax table of whichever buffer is current at the time the macro
execution starts.  Other buffers are not affected.
@end defmac

@node Syntax Properties
@section 構文属性
@kindex syntax-table @r{（テキスト属性）}

言語の構文を指定するに十分なほど構文テーブルに柔軟性がないときには、
バッファ内の特定の文字の出現に対して構文テーブルに優先する
テキスト属性@code{syntax-table}を指定できます。
@xref{Text Properties}。

テキスト属性@code{syntax-table}の正しい値はつぎのとおりです。

@table @asis
@item @var{syntax-table}
属性値が構文テーブルであると、
文字のこの出現に対する構文を判定するために
カレントバッファの構文テーブルのかわりにこのテーブルを用いる。

@item @code{(@var{syntax-code} . @var{matching-char})}
この形のコンスセルは、文字のこの出現の構文を指定する。

@item @code{nil}
属性が@code{nil}であると、通常どおり、
現在の構文テーブルから文字の構文を判定する。
@end table

@defvar parse-sexp-lookup-properties
これが@code{nil}以外であると、@code{forward-sexp}のような
構文を解析する関数は、テキスト属性による構文指定に注意を払う。
さもなければ、現在の構文テーブルのみを用いる。
@end defvar

@defvar syntax-propertize-function
This variable, if non-@code{nil}, should store a function for applying
@code{syntax-table} properties to a specified stretch of text.  It is
intended to be used by major modes to install a function which applies
@code{syntax-table} properties in some mode-appropriate way.

The function is called by @code{syntax-ppss} (@pxref{Position Parse}),
and by Font Lock mode during syntactic fontification (@pxref{Syntactic
Font Lock}).  It is called with two arguments, @var{start} and
@var{end}, which are the starting and ending positions of the text on
which it should act.  It is allowed to call @code{syntax-ppss} on any
position before @var{end}.  However, it should not call
@code{syntax-ppss-flush-cache}; so, it is not allowed to call
@code{syntax-ppss} on some position and later modify the buffer at an
earlier position.
@end defvar

@defvar syntax-propertize-extend-region-functions
This abnormal hook is run by the syntax parsing code prior to calling
@code{syntax-propertize-function}.  Its role is to help locate safe
starting and ending buffer positions for passing to
@code{syntax-propertize-function}.  For example, a major mode can add
a function to this hook to identify multi-line syntactic constructs,
and ensure that the boundaries do not fall in the middle of one.

Each function in this hook should accept two arguments, @var{start}
and @var{end}.  It should return either a cons cell of two adjusted
buffer positions, @code{(@var{new-start} . @var{new-end})}, or
@code{nil} if no adjustment is necessary.  The hook functions are run
in turn, repeatedly, until they all return @code{nil}.
@end defvar

@node Motion and Syntax
@section 移動と構文
@cindex moving across syntax classes
@cindex skipping characters of certain syntax

本節では、特定の構文クラスを持つ文字を越えて移動するための
関数について述べます。

@defun skip-syntax-forward syntaxes &optional limit
この関数は、@var{syntaxes} (a string of syntax class
characters)
で指定される構文クラスを持つ文字を越えて
ポイントを前方へ向けて移動する。
バッファの末尾、（指定されていれば）@var{limit}の位置、
飛び越さない文字のいずれかに出会うと停止する。

If @var{syntaxes} starts with @samp{^}, then the function skips
characters whose syntax is @emph{not} in @var{syntaxes}.

戻り値は移動距離であり非負整数である。
@end defun

@defun skip-syntax-backward syntaxes &optional limit
この関数は、@var{syntaxes}で指定される構文クラスである文字を越えて
ポイントを後方へ向けて移動する。
バッファの先頭、（指定されていれば）@var{limit}の位置、
飛び越さない文字のいずれかに出会うと停止する。

If @var{syntaxes} starts with @samp{^}, then the function skips
characters whose syntax is @emph{not} in @var{syntaxes}.

戻り値は移動距離である。
それはゼロか負の整数である。
@end defun

@defun backward-prefix-chars
この関数は、式前置子構文の文字を飛び越えてポイントを後方へ向けて移動する。
式前置子クラスやフラグ@samp{p}を持つ文字を飛び越す。
@end defun

@node Parsing Expressions
@section 釣り合った式の解析
@cindex parsing expressions
@cindex scanning expressions

ここでは、括弧が対になっている@dfn{S式}（sexp）とも呼ばれる
釣り合った式を解析したり走査する関数について述べます。
following the terminology of Lisp, even though these functions can act
on languages other than Lisp.  Basically, a sexp is either a balanced
parenthetical grouping, a string, or a ``symbol'' (i.e., a sequence
of characters whose syntax is either word constituent or symbol
constituent).  However, characters in the expression prefix syntax
class (@pxref{Syntax Class Table}) are treated as part of the sexp if
they appear next to it.

構文テーブルで文字の解釈を制御することで、
LispモードではLispの式に対して、CモードではCの式に対して
これらの関数を用いることができます。
釣り合った式を飛び越えて移動するための便利な上位レベルの関数については、
@xref{List Motion}。

  A character's syntax controls how it changes the state of the
parser, rather than describing the state itself.  For example, a
string delimiter character toggles the parser state between
``in-string'' and ``in-code'', but the syntax of characters does not
directly say whether they are inside a string.  For example (note that
15 is the syntax code for generic string delimiters),

@example
(put-text-property 1 9 'syntax-table '(15 . nil))
@end example

@noindent
does not tell Emacs that the first eight chars of the current buffer
are a string, but rather that they are all string delimiters.  As a
result, Emacs treats them as four consecutive empty string constants.

@menu
* Motion via Parsing::       Motion functions that work by parsing.
* Position Parse::           Determining the syntactic state of a position.
* Parser State::             How Emacs represents a syntactic state.
* Low-Level Parsing::        Parsing across a specified region.
* Control Parsing::          Parameters that affect parsing.
@end menu

@node Motion via Parsing
@subsection Motion Commands Based on Parsing
@cindex motion based on parsing

  This section describes simple point-motion functions that operate
based on parsing expressions.

@defun scan-lists from count depth
この関数は、位置@var{from}から前方へ向けて@var{count}個の
釣り合った括弧のグループを走査する。
走査を停止した位置を返す。
@var{count}が負であると、後方へ向けて走査する。

@var{depth}が0以外であると、括弧の深さを@var{depth}から数え始める。
走査器は前方向・後方向にバッファを移動し、
括弧の深さが@var{count}回、0になる箇所で停止する。
したがって、@var{depth}に正の値を指定すると、
括弧のレベルを@var{depth}レベルだけ抜けることを意味する。
また、@var{depth}に負の値を指定すると、
@var{-depth}レベルの括弧の深さまで移動する効果を持つ。

@code{parse-sexp-ignore-comments}が@code{nil}以外であると、
コメントを無視して走査する。

走査が@var{count}回、括弧グループを移動する前に、
バッファ（あるいはその参照可能部分）の先頭や末尾に達してしまうと、
その地点における深さが0ならば、@code{nil}を返す。
そうでない場合は、@code{scan-error}エラーを通知する。
@end defun

@defun scan-sexps from count
この関数は、位置@var{from}から前方へ向けて@var{count}個のS式を走査する。
走査を終えた位置を返す。
@var{count}が負であると、後方へ向けて移動する。

@code{parse-sexp-ignore-comments}が@code{nil}以外であると、
コメントを無視して走査する。

走査が括弧によるグループの途中で
バッファ（あるいはその参照可能部分）の先頭や末尾に達すると、
エラーを通知する。
指定個数だけ数えるまえに括弧によるグループのあいだで
先頭や末尾に達した場合は@code{nil}を返す。
@end defun

@defun forward-comment count
この関数は、ポイントを前方へ向けて
@var{count}個の完全コメント（区切り開始列と区切り終了列を含む）と
その途中にある白文字を飛び越えて移動する。
@var{count}が負ならば後方へ向けて移動する。
コメントか白文字以外のものに出会うと停止し、当該箇所にポイントを置く。
This includes (for instance) finding the end
of a comment when moving forward and expecting the beginning of one.
If @var{count} comments are found as
expected, with nothing except whitespace between them, it returns
@code{t}; otherwise it returns @code{nil}.

This function cannot tell whether the ``comments'' it traverses are
embedded within a string.  If they look like comments, it treats them
as comments.

ポイントに続くすべてのコメントと白文字を飛び越えるには、
@code{(forward-comment (buffer-size))}として使う。
バッファ内のコメントの個数は@code{(buffer-size)}を越えるはずがないので、
引数として適している。
@end defun

@node Position Parse
@subsection Finding the Parse State for a Position
@cindex parse state for a position

  For syntactic analysis, such as in indentation, often the useful
thing is to compute the syntactic state corresponding to a given buffer
position.  This function does that conveniently.

@defun syntax-ppss &optional pos
This function returns the parser state that the parser would reach at
position @var{pos} starting from the beginning of the buffer.
@iftex
See the next section for
@end iftex
@ifnottex
@xref{Parser State},
@end ifnottex
for a description of the parser state.

The return value is the same as if you call the low-level parsing
function @code{parse-partial-sexp} to parse from the beginning of the
buffer to @var{pos} (@pxref{Low-Level Parsing}).  However,
@code{syntax-ppss} uses a cache to speed up the computation.  Due to
this optimization, the second value (previous complete subexpression)
and sixth value (minimum parenthesis depth) in the returned parser
state are not meaningful.

This function has a side effect: it adds a buffer-local entry to
@code{before-change-functions} (@pxref{Change Hooks}) for
@code{syntax-ppss-flush-cache} (see below).  This entry keeps the
cache consistent as the buffer is modified.  However, the cache might
not be updated if @code{syntax-ppss} is called while
@code{before-change-functions} is temporarily let-bound, or if the
buffer is modified without running the hook, such as when using
@code{inhibit-modification-hooks}.  In those cases, it is necessary to
call @code{syntax-ppss-flush-cache} explicitly.
@end defun

@defun syntax-ppss-flush-cache beg &rest ignored-args
This function flushes the cache used by @code{syntax-ppss}, starting
at position @var{beg}.  The remaining arguments, @var{ignored-args},
are ignored; this function accepts them so that it can be directly
used on hooks such as @code{before-change-functions} (@pxref{Change
Hooks}).
@end defun

  Major modes can make @code{syntax-ppss} run faster by specifying
where it needs to start parsing.

@defvar syntax-begin-function
If this is non-@code{nil}, it should be a function that moves to an
earlier buffer position where the parser state is equivalent to
@code{nil}---in other words, a position outside of any comment,
string, or parenthesis.  @code{syntax-ppss} uses it to further
optimize its computations, when the cache gives no help.
@end defvar

@node Parser State
@subsection Parser State
@cindex parser state

  A @dfn{parser state} is a list of ten elements describing the state
of the syntactic parser, after it parses the text between a specified
starting point and a specified end point in the buffer.  Parsing
functions such as @code{syntax-ppss}
@ifnottex
(@pxref{Position Parse})
@end ifnottex
return a parser state as the value.  Some parsing functions accept a
parser state as an argument, for resuming parsing.

  Here are the meanings of the elements of the parser state:

@enumerate 0
@item
The depth in parentheses, counting from 0.  @strong{Warning:} this can
be negative if there are more close parens than open parens between
the parser's starting point and end point.

@item
@cindex innermost containing parentheses
The character position of the start of the innermost parenthetical
grouping containing the stopping point; @code{nil} if none.

@item
@cindex previous complete subexpression
The character position of the start of the last complete subexpression
terminated; @code{nil} if none.

@item
@cindex inside string
Non-@code{nil} if inside a string.  More precisely, this is the
character that will terminate the string, or @code{t} if a generic
string delimiter character should terminate it.

@item
@cindex inside comment
@code{t} if inside a non-nestable comment (of any comment style;
@pxref{Syntax Flags}); or the comment nesting level if inside a
comment that can be nested.

@item
@cindex quote character
@code{t} if the end point is just after a quote character.

@item
The minimum parenthesis depth encountered during this scan.

@item
What kind of comment is active: @code{nil} if not in a comment or in a
comment of style @samp{a}; 1 for a comment of style @samp{b}; 2 for a
comment of style @samp{c}; and @code{syntax-table} for a comment that
should be ended by a generic comment delimiter character.

@item
The string or comment start position.  While inside a comment, this is
the position where the comment began; while inside a string, this is the
position where the string began.  When outside of strings and comments,
this element is @code{nil}.

@item
Internal data for continuing the parsing.  The meaning of this
data is subject to change; it is used if you pass this list
as the @var{state} argument to another call.
@end enumerate

  Elements 1, 2, and 6 are ignored in a state which you pass as an
argument to continue parsing, and elements 8 and 9 are used only in
trivial cases.  Those elements are mainly used internally by the
parser code.

  One additional piece of useful information is available from a
parser state using this function:

@defun syntax-ppss-toplevel-pos state
This function extracts, from parser state @var{state}, the last
position scanned in the parse which was at top level in grammatical
structure.  ``At top level'' means outside of any parentheses,
comments, or strings.

The value is @code{nil} if @var{state} represents a parse which has
arrived at a top level position.
@end defun

@node Low-Level Parsing
@subsection Low-Level Parsing

  The most basic way to use the expression parser is to tell it
to start at a given position with a certain state, and parse up to
a specified end position.

@defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment
This function parses a sexp in the current buffer starting at
@var{start}, not scanning past @var{limit}.  It stops at position
@var{limit} or when certain criteria described below are met, and sets
point to the location where parsing stops.  It returns a parser state
@ifinfo
(@pxref{Parser State})
@end ifinfo
describing the status of the parse at the point where it stops.

@cindex parenthesis depth
If the third argument @var{target-depth} is non-@code{nil}, parsing
stops if the depth in parentheses becomes equal to @var{target-depth}.
The depth starts at 0, or at whatever is given in @var{state}.

If the fourth argument @var{stop-before} is non-@code{nil}, parsing
stops when it comes to any character that starts a sexp.  If
@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
start of a comment.  If @var{stop-comment} is the symbol
@code{syntax-table}, parsing stops after the start of a comment or a
string, or the end of a comment or a string, whichever comes first.

If @var{state} is @code{nil}, @var{start} is assumed to be at the top
level of parenthesis structure, such as the beginning of a function
definition.  Alternatively, you might wish to resume parsing in the
middle of the structure.  To do this, you must provide a @var{state}
argument that describes the initial status of parsing.  The value
returned by a previous call to @code{parse-partial-sexp} will do
nicely.
@end defun

@node Control Parsing
@subsection Parameters to Control Parsing
@cindex parsing, control parameters

@defvar multibyte-syntax-as-symbol
If this variable is non-@code{nil}, @code{scan-sexps} treats all
non-@acronym{ASCII} characters as symbol constituents regardless
of what the syntax table says about them.  (However, text properties
can still override the syntax.)
@end defvar

@defopt parse-sexp-ignore-comments
@cindex コメントを飛び越える
値が@code{nil}以外であると、
本節の関数や@code{forward-sexp}、@code{scan-lists}および@code{scan-sexps}は、
コメントを白文字として扱う。
@end defopt

@vindex parse-sexp-lookup-properties
The behavior of @code{parse-partial-sexp} is also affected by
@code{parse-sexp-lookup-properties} (@pxref{Syntax Properties}).

You can use @code{forward-comment} to move forward or backward over
one comment or several comments.

@node Syntax Table Internals
@section 構文テーブルの内部
@cindex 構文テーブルの内部

構文テーブルは文字テーブル（@pxref{Char-Tables}）として実装されているが、
多くのLispプログラムは通常、構文テーブルの要素を直接には操作しない。
構文テーブルは構文データを構文記述（@pxref{Syntax Descriptors}）として保存せず、
内部的な形式を用いる。本節ではこの内部形式を説明するが、
この形式は構文属性として割り当てることもできる（@pxref{Syntax Properties}）。

@cindex syntax code
@cindex raw syntax descriptor
構文テーブルの各要素は、@dfn{生の構文記述子}、すなわち
@code{(@var{syntax-code} . @var{matching-char})}の
形のコンスセルである。
@sc{car}の@var{syntax-code}は、構文クラスと構文フラグを符号化する整数です。
釣り合う文字を指定してあると、
@sc{cdr}の@var{matching-char}は@code{nil}以外です。

つぎの表は、各構文クラスに対応する@var{syntax-code}の値です。

@multitable @columnfractions .2 .3 .2 .3
@item
@i{Code} @tab @i{Class} @tab @i{Code} @tab @i{Class}
@item
0 @tab whitespace @tab 8 @tab paired delimiter
@item
1 @tab punctuation @tab 9 @tab escape
@item
2 @tab word @tab 10 @tab character quote
@item
3 @tab symbol @tab 11 @tab comment-start
@item
4 @tab open parenthesis @tab 12 @tab comment-end
@item
5 @tab close parenthesis @tab 13 @tab inherit
@item
6 @tab expression prefix @tab 14 @tab generic comment
@item
7 @tab string quote @tab 15 @tab generic string
@end multitable

@noindent
たとえば、@samp{(}の普通の構文値は、@code{(4 . 41)}です。
（41は@samp{)}の文字コード。）

フラグは、最下位ビットから16番目のビットから始まる上位ビットに符号化します。
つぎの表は、各構文フラグとそれに対応する2の巾です。

@multitable @columnfractions .15 .3 .15 .3
@item
@i{Prefix} @tab @i{Flag} @tab @i{Prefix} @tab @i{Flag}
@item
@samp{1} @tab @code{(lsh 1 16)} @tab @samp{p} @tab @code{(lsh 1 20)}
@item
@samp{2} @tab @code{(lsh 1 17)} @tab @samp{b} @tab @code{(lsh 1 21)}
@item
@samp{3} @tab @code{(lsh 1 18)} @tab @samp{n} @tab @code{(lsh 1 22)}
@item
@samp{4} @tab @code{(lsh 1 19)}
@end multitable

@defun string-to-syntax desc
Given a syntax descriptor @var{desc} (a string), this function returns
the corresponding raw syntax descriptor.
@end defun

@defun syntax-after pos
This function returns the raw syntax descriptor for the character in
the buffer after position @var{pos}, taking account of syntax
properties as well as the syntax table.  If @var{pos} is outside the
buffer's accessible portion (@pxref{Narrowing, accessible portion}),
the return value is @code{nil}.
@end defun

@defun syntax-class syntax
This function returns the syntax code for the raw syntax descriptor
@var{syntax}.  More precisely, it takes the raw syntax descriptor's
@var{syntax-code} component, masks off the high 16 bits which record
the syntax flags, and returns the resulting integer.

If @var{syntax} is @code{nil}, the return value is returns @code{nil}.
This is so that the expression

@example
(syntax-class (syntax-after pos))
@end example

@noindent
evaluates to @code{nil} if @code{pos} is outside the buffer's
accessible portion, without throwing errors or returning an incorrect
code.
@end defun

@node Categories
@section カテゴリ
@cindex 文字のカテゴリ
@cindex カテゴリ、文字

@dfn{カテゴリ}（category）は、文字を構文的に分類する別の方法です。
必要に応じて複数のカテゴリを定義できて、
そうすると各文字に1つか複数のカテゴリを独立に設定できます。
構文クラスと異なり、カテゴリは互いに排他的ではありません。
1つの文字が複数のカテゴリに属することは普通にあります。

@cindex category table
各バッファには@dfn{カテゴリテーブル}（category table）があり、
どのカテゴリが定義済みでどの文字がどのカテゴリに属するかを記録しています。
各カテゴリテーブルはそれ独自のカテゴリ群を定義しますが、
それらは標準のカテゴリテーブルをコピーして普通は初期化されます。
そのため、すべてのモードで標準のカテゴリを使えます。

各カテゴリには名前があり、それは@w{@samp{ }}から@samp{~}までの
範囲の@acronym{ASCII}印字文字です。
@code{define-category}でカテゴリを定義するときにその名前を指定します。

@cindex category set
カテゴリテーブルは実際には文字テーブル（@pxref{Char-Tables}）です。
カテゴリテーブルの添字@var{c}の要素は、
@dfn{カテゴリ集合}（category set）です。
これはブールベクトルであり、文字@var{c}が属するカテゴリ群を表します。
このカテゴリ集合において、添字@var{cat}の要素が@code{t}であると、
@var{cat}は集合の要素であることを意味し、
当該文字@var{c}はカテゴリ@var{cat}に属することを意味します。

For the next three functions, the optional argument @var{table}
defaults to the current buffer's category table.

@defun define-category char docstring &optional table
この関数は、名前を@var{char}、説明文字列を@var{docstring}として
新たなカテゴリテーブル@var{table}を定義する。

Here's an example of defining a new category for characters that have
strong right-to-left directionality (@pxref{Bidirectional Display})
and using it in a special category table:

@example
(defvar special-category-table-for-bidi
  (let ((category-table (make-category-table))
	(uniprop-table (unicode-property-table-internal 'bidi-class)))
    (define-category ?R "Characters of bidi-class R, AL, or RLO"
                     category-table)
    (map-char-table
     #'(lambda (key val)
	 (if (memq val '(R AL RLO))
	     (modify-category-entry key ?R category-table)))
     uniprop-table)
    category-table))
@end example
@end defun

@defun category-docstring category &optional table
この関数は、カテゴリテーブル@var{table}のカテゴリ@var{category}の
説明文字列を返す。

@example
(category-docstring ?a)
     @result{} "ASCII"
(category-docstring ?l)
     @result{} "Latin"
@end example
@end defun

@defun get-unused-category &optional table
この関数は、カテゴリテーブル@var{table}で現在定義されていない
新たなカテゴリ名（文字）を返す。
@var{table}において可能なすべてのカテゴリが使用済みであると@code{nil}を返す。
@end defun

@defun category-table
この関数は、カレントバッファのカテゴリテーブルを返す。
@end defun

@defun category-table-p object
この関数は、@var{object}がカテゴリテーブルであると@code{t}を返し、
さもなければ@code{nil}を返す。
@end defun

@defun standard-category-table
この関数は、標準のカテゴリテーブルを返す。
@end defun

@defun copy-category-table &optional table
この関数は、カテゴリテーブル@var{table}のコピーを作成しそれを返す。
@var{table}を指定しない（あるいは@code{nil}）と、
現在のカテゴリテーブルのコピーを返す。
@var{table}がカテゴリテーブルでないとエラーを通知する。
@end defun

@defun set-category-table table
この関数は、カレントバッファのカテゴリテーブルを@var{table}とする。
@var{table}を返す。
@end defun

@defun make-category-table
This creates and returns an empty category table.  In an empty category
table, no categories have been allocated, and no characters belong to
any categories.
@end defun

@defun make-category-set categories
この関数は、新たなカテゴリ集合、すなわち
文字列@var{categories}に指定したカテゴリで
内容を初期化したブールベクトルを返す。
@var{categories}の要素はカテゴリ名であること。
新たなカテゴリ集合では、これらの各カテゴリに対しては@code{t}を、
それ以外のカテゴリに対しては@code{nil}を設定する。

@example
(make-category-set "al")
     @result{} #&128"\0\0\0\0\0\0\0\0\0\0\0\0\2\20\0\0"
@end example
@end defun

@defun char-category-set char
この関数は、文字@var{char}に対するカテゴリ集合を返す。
これは、文字@var{char}が属するカテゴリ群を記録したブールベクトルである。
関数@code{char-category-set}は、カテゴリテーブルに
存在する同じブールベクトルを返すため、新たな領域を割り付けない。

@example
(char-category-set ?a)
     @result{} #&128"\0\0\0\0\0\0\0\0\0\0\0\0\2\20\0\0"
@end example
@end defun

@defun category-set-mnemonics category-set
この関数は、カテゴリ集合@var{category-set}を
この集合に属するすべてのカテゴリの名前からなる文字列に変換する。

@example
(category-set-mnemonics (char-category-set ?a))
     @result{} "al"
@end example
@end defun

@defun modify-category-entry char category &optional table reset
この関数は、カテゴリテーブル@var{table}（デフォルトはカレントバッファの
カテゴリテーブル）内の文字@var{char}のカテゴリ集合を変更する。
@var{char} can be a character, or a cons cell of the form
@code{(@var{min} . @var{max})}; in the latter case, the function
modifies the category sets of all characters in the range between
@var{min} and @var{max}, inclusive.

普通、カテゴリ集合に@var{category}を追加して変更する。
しかし、@var{reset}が@code{nil}以外であると@var{category}を削除する。
@end defun

@deffn Command describe-categories &optional buffer-or-name
This function describes the category specifications in the current
category table.  It inserts the descriptions in a buffer, and then
displays that buffer.  If @var{buffer-or-name} is non-@code{nil}, it
describes the category table of that buffer instead.
@end deffn