From d353c2a2c76a432f089adff3d937167d10e67611 Mon Sep 17 00:00:00 2001
From: Michal Vaner
Date: Sat, 6 Sep 2008 14:37:14 +0200
Subject: [PATCH] Some sherlock independent docs put into libucw

The originals are kept in place, need to decide what to do with them.
Made translatable by asciidoc
---
 ucw/doc/Makefile      |   2 +-
 ucw/doc/config.txt    | 177 ++++++++++++++++++++++++++++++++++++++++++
 ucw/doc/configure.txt |  54 +++++++++++++
 3 files changed, 232 insertions(+), 1 deletion(-)
 create mode 100644 ucw/doc/config.txt
 create mode 100644 ucw/doc/configure.txt

diff --git a/ucw/doc/Makefile b/ucw/doc/Makefile
index 5fdf0375..0397672e 100644
--- a/ucw/doc/Makefile
+++ b/ucw/doc/Makefile
@@ -2,7 +2,7 @@
 DIRS+=ucw/doc
-UCW_DOCS=fastbuf index basecode hash docsys conf mempool
+UCW_DOCS=fastbuf index config configure basecode hash docsys conf mempool
 UCW_INDEX=$(o)/ucw/doc/def_index.html
 UCW_DOCS_HTML=$(addprefix $(o)/ucw/doc/,$(addsuffix .html,$(UCW_DOCS)))
diff --git a/ucw/doc/config.txt b/ucw/doc/config.txt
new file mode 100644
index 00000000..4672ee3d
--- /dev/null
+++ b/ucw/doc/config.txt
@@ -0,0 +1,177 @@
+Configuration files
+===================
+
+This document describes run-time configuration of UCW using
+config files. For compile-time configuration, see doc/configure.
+
+Terminology
+-----------
+
+Configuration items of all modules are organized into sections.
+The sections form a tree structure with top-level sections corresponding
+to program modules.
+
+Each configuration item belongs to one of the following classes:
+
+  1. single value or a fixed-length array of values
+  2. variable-length array of values
+  3. subsection with several nested attributes
+  4. list of nodes, each being an instance of a subsection
+  5. bitmap of small integers (0..31) or fixed list of strings
+  6. exceptions (items with irregular syntax; however, they always
+     appear as a sequence of strings, only the semantics differ)
+
+Both fixed- and variable-length arrays consist of items of the same
+type. The basic types supported by the configuration mechanism are:
+
+  1. 32-bit integer
+  2. 64-bit integer
+  3. floating point number
+  4. IP address
+  5. string
+  6. choice (one of a fixed list of strings)
+
+Program modules can define their own special types (such as network
+masks or attribute names) and decide how they are parsed.
+
+Format of configuration files
+-----------------------------
+
+Configuration files are text files that usually set one attribute per
+line, though it is possible to split one assignment into multiple lines
+and/or assign several attributes in one line. The basic format of an
+assignment command is
+
+  name value1 value2 ... valueN
+
+or
+
+  name=value1 value2 ... valueN
+
+The end of a line also ends the command unless it is preceded by a
+backslash. On the other hand, a semicolon terminates the command and
+another command can start after the semicolon. A hash starts a comment
+that lasts until the end of the line. A value can be enclosed in
+apostrophes or quotation marks and then it can contain spaces and/or
+control characters, otherwise the first space or control character
+denotes the end of the value. Values enclosed in quotation marks are
+interpreted as C-strings. For example, the following are valid
+assignment commands:
+
+  Database "main db\x2b"; Directory='index/'; Weights 100 20 30 \
+    40 50 80	# a comment that is ignored
+
+Numerical values can be followed by a unit. The following units are
+supported:
+
+  d=86400	k=1000		K=1024
+  h=3600	m=1000000	M=1048576
+  %=0.01	g=1000000000	G=1073741824
+
+Attributes of a section or a list node can be set in two ways. First,
+you can write the name of the section or list, open a bracket, and then
+set the attributes inside the section. For example,
+
+  Section1 {
+    Attr1 value1
+    Attr2 value2
+    ListNode { #creates a list and adds its first node
+      Attr3 value3
+      Attr4 value4
+    }
+    ListNode { Attr3=value5; Attr4=value6 }
+      #appends a new node; this is still the same syntax
+  }
+
+The second possibility is using a shorter syntax when all attributes of a
+section are set on one line in a fixed order. The above example could
+equally be written as
+
+  Section1 {
+    Attr1 value1
+    Attr2 value2
+    ListNode value3 value4
+    ListNode value5 value6
+  }
+
+Of course, you cannot use the latter syntax when the attributes allow
+variable numbers of parameters. The parser of the configuration files
+checks this possibility.
+
+If you want to set a single attribute in some section, you can also
+refer to the attribute as Section.Attribute.
+
+Lists support several operations besides adding a new node. You just
+have to write a colon immediately after the attribute name, followed by
+the name of the operation. The following operations are supported:
+
+  List:clear				# removes all nodes
+  List:append { attr1=value1; ... }	# adds a new node at the end
+  List:prepend { attr1=value1; ... }	# adds a new node at the beginning
+  List:remove { attr1=search1 }		# find a node and delete it
+  List:edit { attr1=search1 } { attr1=value1; ... }
+					# find a node and edit it
+  List:after { attr1=search1 } { ... }	# insert a node after a found node
+  List:before { attr1=search1 } { ... }	# insert a node before a found node
+  List:copy { attr1=search1 } { ... }	# duplicate a node and edit the copy
+
+You can specify several attributes in the search condition and the nodes
+are tested for equality in all these attributes. In the editing
+commands, you can either open a second block with overridden attributes,
+or specify the new values using the shorter one-line syntax.
+
+The commands :clear, :append, and :prepend are also supported by variable-length
+arrays. The command :clear can also be used on string values. The following
+operations can be used on bitmaps: :set, :remove, :clear, and :all.
+
+Including other files
+---------------------
+
+To include another file, use the command
+
+  include another/file
+
+(Beware that this command has to be the last one on the line.)
+
+Command-line parameters
+-----------------------
+
+The default configuration file (cf_def_file possibly overridden
+by environment variable cf_env_file) is read before the program is started.
+You can use a -C option to override the name of the configuration file.
+If you use this parameter several times, then all those files are loaded
+consecutively. A parameter -S can be used to execute a configuration
+command directly (after loading the default or specified configuration
+file). Example:
+
+  bin/program -Ccf/my-config -S'module.trace=2;module.logfile:clear' ...
+
+If the program is compiled with debugging information, then one more
+parameter `--dumpconfig` is supported. It prints all parsed configuration
+items and exits.
+
+All these switches must be used before any other parameters of the
+program.
+
+Preprocessing
+-------------
+
+During compilation, all configuration files are pre-processed by a simple
+C-like preprocessor, which supports `#ifdef`, `#ifndef`, `#if`,
+`#elsif`, `#else` and `#endif` directives referring to compile-time
+configuration variables. `#if` and `#elsif` can contain any Perl expression
+where each `CONFIG_xyz` configuration variable is replaced by 0 or 1
+depending on its value.
+
+The preprocessor also substitutes `@VARIABLE@` by the value of the variable,
+which must be defined.
+
+Caveats
+-------
+
+Trying to access an unknown attribute causes an error, but unrecognized
+top-level sections are ignored. The reason is that a common config file
+is used for a lot of programs which recognize only their own sections.
+
+Names of sections, attributes and choices are case-insensitive. Units are
+case-sensitive.
diff --git a/ucw/doc/configure.txt b/ucw/doc/configure.txt
new file mode 100644
index 00000000..5dd3c7ac
--- /dev/null
+++ b/ucw/doc/configure.txt
@@ -0,0 +1,54 @@
+How to Configure Sherlock
+=========================
+
+What can be configured
+----------------------
+There are three different levels of configuring/customizing programs
+based on libucw:
+
+ - runtime configuration in configuration files (see doc/config)
+
+ - compile-time configuration: config switches set before compiling, selecting
+   optional features.
+
+ - customization: a set of C declarations and functions, which define word
+   types, attributes and the ways to match them. In most cases, you should
+   use the "free" customization designed for searching in web pages and similar
+   documents. If you want just a simple search engine for indexing databases,
+   the "customs/bare" customization can be better. If you want to extend Sherlock
+   beyond that, you will probably need to write your own customization -- just
+   follow the structure of the "free" customization and the comments there.
+
+
+How to configure
+----------------
+To set up compilation for a given customization, possibly overriding its default
+compile-time options, just run
+
+	./configure [