.\"
.\" States manual page.
.\" Copyright (c) 1997-1998 Markku Rossi.
.\" Author: Markku Rossi <mtr@iki.fi>
.\"
.\" This file is part of GNU enscript.
.\"
.\" This program is free software; you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation; either version 2, or (at your option)
.\" any later version.
.\"
.\" This program is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public License
.\" along with this program; see the file COPYING. If not, write to
.\" the Free Software Foundation, 59 Temple Place - Suite 330,
.\" Boston, MA 02111-1307, USA.
.\"
.TH STATES 1 "Oct 23, 1998" "STATES" "STATES"

.SH NAME
states \- awk-alike text processing tool

.SH SYNOPSIS
.B states
[\f3\-hvV\f1]
[\f3\-D \f2var\f3=\f2val\f1]
[\f3\-f \f2file\f1]
[\f3\-o \f2outputfile\f1]
[\f3\-p \f2path\f1]
[\f3\-s \f2startstate\f1]
[\f3\-W \f2level\f1]
[\f2filename\f1 ...]

.SH DESCRIPTION

\f3States\f1 is an awk-alike text processing tool with some state
machine extensions. It is designed for program source code
highlighting and for similar tasks where state information helps input
processing.

At any point in time, \f3states\f1 is in exactly one state. Each
state is quite similar to awk's work environment: it has regular
expressions which are matched against the input and actions which are
executed when a match is found. From the action blocks, \f3states\f1
can perform state transitions; it can move to another state from which
the processing is continued. State transitions are recorded so
\f3states\f1 can return to the calling state once the current state
has finished.

The biggest difference between \f3states\f1 and awk, besides state
machine extensions, is that \f3states\f1 is not line-oriented. It
matches regular expression tokens from the input and once a match is
processed, it continues processing from the current position, not from
the beginning of the next input line.

.SH OPTIONS
.TP 8
.B \-D \f2var\f3=\f2val\f3, \-\-define=\f2var\f3=\f2val\f3
Define variable \f2var\f1 to have string value \f2val\f1. Command
line definitions override variable definitions found in the config
file.
.TP 8
.B \-f \f2file\f3, \-\-file=\f2file\f3
Read state definitions from file \f2file\f1. By default,
\f3states\f1 tries to read state definitions from file \f3states.st\f1
in the current working directory.
.TP 8
.B \-h, \-\-help
Print a short help message and exit.
.TP 8
.B \-o \f2file\f3, \-\-output=\f2file\f3
Save output to file \f2file\f1 instead of printing it to
\f3stdout\f1.
.TP 8
.B \-p \f2path\f3, \-\-path=\f2path\f3
Set the load path to \f2path\f1. The load path defaults to the
directory from which the state definitions file is loaded.
.TP 8
.B \-s \f2state\f3, \-\-state=\f2state\f3
Start execution from state \f2state\f1. This definition overrides
the start state resolved from the \f3start\f1 block.
.TP 8
.B \-v, \-\-verbose
Increase the program verbosity.
.TP 8
.B \-V, \-\-version
Print \f3states\f1 version and exit.
.TP 8
.B \-W \f2level\f3, \-\-warning=\f2level\f3
Set the warning level to \f2level\f1. Possible values for \f2level\f1
are:
.RS 8
.TP 8
.B light
light warnings (default)
.TP 8
.B all
all warnings
.RE
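.PP
As an illustration only, several of the options above could be
combined as in the following command line. The file names
\f3my.st\f1, \f3hello.c\f1 and \f3hello.out\f1 and the state name
\f3c\f1 are placeholders, not files or states shipped with the
program:
.PP
.RS
.nf
states \-f my.st \-s c \-W all \-o hello.out hello.c
.fi
.RE
.PP
Going by the option descriptions, this should read the state
definitions from \f3my.st\f1, start processing \f3hello.c\f1 from
state \f3c\f1, report all warnings and save the result to
\f3hello.out\f1.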

.SH STATES PROGRAM FILES

\f3States\f1 program files can contain one \f2start\f1 block,
\f2startrules\f1 and \f2namerules\f1 blocks to specify the initial
state, \f2state\f1 definitions and \f2expressions\f1.

The \f2start\f1 block is the main() of the \f3states\f1 program. It is
executed on script startup for each input file and it can perform any
initialization the script needs. It normally also calls the
\f3check_startrules()\f1 and \f3check_namerules()\f1 primitives which
resolve the initial state from the input file name or the data found
at the beginning of the input file. Here is a sample start block
which initializes two variables and does the standard start state
resolving:
.PP
.RS
.nf
start
{
  a = 1;
  msg = "Hello, world!";
  check_startrules ();
  check_namerules ();
}
.fi
.RE
.PP
Once the start block is processed, the input processing is continued
from the initial state.

The initial state is resolved by the information found from the
\f2startrules\f1 and \f2namerules\f1 blocks. Both blocks contain
regular expression - symbol pairs. When the regular expression
matches the name or the beginning of the input file, the
initial state is named by the corresponding symbol. For example, the
following start and name rules can distinguish C and Fortran files:
.PP
.RS
.nf
namerules
{
  /\e.(c|h)$/ c;
  /\e.[fF]$/ fortran;
}

startrules
{
  /-\e*- [cC] -\e*-/ c;
  /-\e*- fortran -\e*-/ fortran;
}
.fi
.RE
.PP
If these rules are used with the previously shown start block,
\f3states\f1 first checks the beginning of the input file. If it has
the string \f3-*- c -*-\f1, the file is assumed to contain C code and
processing is started from the state called \f3c\f1. If the beginning of
the input file has the string \f3-*- fortran -*-\f1, the initial state is
\f3fortran\f1. If none of the start rules matched, the name of the
input file is matched against the namerules. If the name ends with the
suffix \f3.c\f1 or \f3.h\f1, we go to state \f3c\f1. If the suffix is
\f3.f\f1 or \f3.F\f1, the initial state is \f3fortran\f1.

If both start and name rules fail to resolve the start state,
\f3states\f1 just copies its input to output unmodified.

The start state can also be specified from the command line with
option \f3\-s\f1, \f3\-\-state\f1.
.PP
State definitions have the following syntax:

.B state \f2name\f3 { \f2expr\f1 {\f2statements\f1} ... }

where \f2name\f1 is the name of the state, \f2expr\f1 is a regular
expression, a special expression or a symbol, and \f2statements\f1 is
a list of statements. When the
expression \f2expr\f1 is matched from the input, the statement block
is executed. The statement block can call \f3states\f1' primitives,
user-defined subroutines, call other states, etc. Once the block is
executed, the input processing is continued from the current input
position (which might have been changed if the statement block called
other states).

Special expressions \f3BEGIN\f1 and \f3END\f1 can be used in the place
of \f2expr\f1. Expression \f3BEGIN\f1 matches the beginning of the
state; its block is called when the state is entered. Expression
\f3END\f1 matches the end of the state; its block is executed when
\f3states\f1 leaves the state.
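.PP
As a minimal sketch of how these pieces might fit together, the
following two states bracket shell-style comments in the output. The
state names \f3script\f1 and \f3comment\f1 are hypothetical, and the
sketch assumes that \f3\en\f1 denotes a newline in regular expressions
and strings and that a bare \f3return\f1 statement leaves the current
state:
.PP
.RS
.nf
state comment
{
  BEGIN {
    print ("[comment: ");
  }
  /\en/ {
    return;
  }
  END {
    print ("]\en");
  }
}

state script
{
  /#/ {
    call (comment);
  }
}
.fi
.RE
.PP
When \f3script\f1 sees a \f3#\f1, it calls \f3comment\f1. The
\f3BEGIN\f1 block should run on entry, and the \f3END\f1 block once
the newline rule returns control to \f3script\f1.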
.PP
If \f2expr\f1 is a symbol, its value is looked up from the global
environment. If the value is a regular expression, it is matched
against the input; otherwise the rule is ignored.

The \f3states\f1 program file can also have top-level expressions.
They are evaluated after the program file is parsed but before any
input files are processed or the \f2start\f1 block is evaluated.

.SH PRIMITIVE FUNCTIONS

.TP 8
.B call (\f2symbol\f3)
Move to state \f2symbol\f1 and continue input file processing from
that state. The function returns whatever the \f2symbol\f1 state's
terminating \f3return\f1 statement returned.
.TP 8
.B calln (\f2name\f3)
Like \f3call\f1 but the argument \f2name\f1 is evaluated and its value
must be a string. For example, this function can be used to call a
state whose name is stored in a variable.
.TP 8
.B check_namerules ()
Try to resolve the start state from the \f3namerules\f1 rules. The
function returns \f31\f1 if the start state was resolved or \f30\f1
otherwise.
.TP 8
.B check_startrules ()
Try to resolve the start state from the \f3startrules\f1 rules. The
function returns \f31\f1 if the start state was resolved or \f30\f1
otherwise.
.TP 8
.B concat (\f2str\f3, ...)
Concatenate argument strings and return the result as a new string.
.TP 8
.B float (\f2any\f3)
Convert argument to a floating point number.
.TP 8
.B getenv (\f2str\f3)
Get the value of environment variable \f2str\f1. Returns an empty
string if variable \f2str\f1 is undefined.
.TP 8
.B int (\f2any\f3)
Convert argument to an integer number.
.TP 8
.B length (\f2item\f3, ...)
Count the length of argument strings or lists.
.TP 8
.B list (\f2any\f3, ...)
Create a new list which contains items \f2any\f1, ...
.TP 8
.B panic (\f2any\f3, ...)
Report a non-recoverable error and exit with status \f31\f1. The
function never returns.
.TP 8
.B print (\f2any\f3, ...)
Convert arguments to strings and print them to the output.
.TP 8
.B range (\f2source\f3, \f2start\f3, \f2end\f3)
Return a sub\-range of \f2source\f1 starting from position \f2start\f1
(inclusively) to \f2end\f1 (exclusively). Argument \f2source\f1 can
be a string or a list.
.TP 8
.B regexp (\f2string\f3)
Convert string \f2string\f1 to a new regular expression.
.TP 8
.B regexp_syntax (\f2char\f3, \f2syntax\f3)
Modify regular expression character syntaxes by assigning new
syntax \f2syntax\f1 for character \f2char\f1. Possible values for
\f2syntax\f1 are:
.RS 8
.TP 8
.B 'w'
character is a word constituent
.TP 8
.B ' '
character isn't a word constituent
.RE
.TP 8
.B regmatch (\f2string\f3, \f2regexp\f3)
Check if string \f2string\f1 matches regular expression \f2regexp\f1.
The function returns a boolean success status and sets sub-expression
registers \f3$\f2n\f1.
.TP 8
.B regsub (\f2string\f3, \f2regexp\f3, \f2subst\f3)
Search for regular expression \f2regexp\f1 in string \f2string\f1 and
replace the matching substring with string \f2subst\f1. Returns the
resulting string. The substitution string \f2subst\f1 can contain
\f3$\f2n\f1 references to the \f2n\f1:th parenthesized
sub-expression.
.TP 8
.B regsuball (\f2string\f3, \f2regexp\f3, \f2subst\f3)
Like \f3regsub\f1 but replace all matches of regular expression
\f2regexp\f1 in string \f2string\f1 with string \f2subst\f1.
.TP 8
.B require_state (\f2symbol\f3)
Check that the state \f2symbol\f1 is defined. If the required state
is undefined, the function tries to autoload it. If the loading
fails, the program will terminate with an error message.
.TP 8
.B split (\f2regexp\f3, \f2string\f3)
Split string \f2string\f1 into a list, treating matches of regular
expression \f2regexp\f1 as item separators.
.TP 8
.B sprintf (\f2fmt\f3, ...)
Format arguments according to \f2fmt\f1 and return the result as a
string.
.TP 8
.B strcmp (\f2str1\f3, \f2str2\f3)
Perform a case\-sensitive comparison of strings \f2str1\f1 and
\f2str2\f1. The function returns a value that is:
.RS 8
.TP 8
.B \-1
string \f2str1\f1 is less than \f2str2\f1
.TP 8
.B 0
strings are equal
.TP 8
.B 1
string \f2str1\f1 is greater than \f2str2\f1
.RE
.TP 8
.B string (\f2any\f3)
Convert argument to string.
.TP 8
.B strncmp (\f2str1\f3, \f2str2\f3, \f2num\f3)
Perform a case\-sensitive comparison of strings \f2str1\f1 and
\f2str2\f1, comparing at most \f2num\f1 characters.
.TP 8
.B substring (\f2str\f3, \f2start\f3, \f2end\f3)
Return a substring of string \f2str\f1 starting from position
\f2start\f1 (inclusively) to \f2end\f1 (exclusively).
.RE
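.PP
As a hedged illustration of a few of these primitives, the statements
below could appear in a \f2start\f1 block or an action block. The
variable names and literal strings are made up for the example, and
the C-style comments are assumed to be accepted by the parser:
.PP
.RS
.nf
name = "hello.c";

/* Strip the suffix and build a new name; should give "hello.ps". */
base = regsub (name, /\e.c$/, "");
print (concat (base, ".ps"));

/* Split a comma-separated string into a list of three items. */
parts = split (/,/, "a,b,c");
print (length (parts));

/* Replace every dash; should give "a+b+c". */
print (regsuball ("a-b-c", /-/, "+"));
.fi
.RE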

.SH BUILTIN VARIABLES
.TP 8
.B $.
current input line number
.TP 8
.B $\f2n\f3
the \f2n\f1:th parenthesized regular expression sub-expression from the
latest state regular expression or from the \f3regmatch\f1 primitive
.TP 8
.B $`
everything before the matched regular expression. This is useful
with the \f3regmatch\f1 primitive; the contents of this
variable are undefined when used in action blocks to refer to the data
before the block's regular expression.
.TP 8
.B $B
an alias for \f3$`\f1
.TP 8
.B argv
list of input file names
.TP 8
.B filename
name of the current input file
.TP 8
.B program
name of the program (usually \f3states\f1)
.TP 8
.B version
program version string
.RE
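.PP
For instance, an action block might combine these variables to report
where a match was found. This is only a sketch; the state name
\f3scan\f1 and the pattern are hypothetical, and \f3\en\f1 is assumed
to denote a newline inside a string:
.PP
.RS
.nf
state scan
{
  /TODO/ {
    print (filename, ":", $., ": TODO\en");
  }
}
.fi
.RE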

.SH FILES
.nf
.ta 4i
@DATADIR@/enscript/hl/*.st	enscript's states definitions
.fi

.SH SEE ALSO
awk(1), enscript(1)

.SH AUTHOR
Markku Rossi <mtr@iki.fi> <http://www.iki.fi/~mtr/>

GNU Enscript WWW home page: <http://www.iki.fi/~mtr/genscript/>