From e192c5185ebb7c75a90859d54c2aaf3872249470 Mon Sep 17 00:00:00 2001
From: Martin Mares
Date: Tue, 13 Jul 2010 14:01:13 +0200
Subject: [PATCH] Imported doc/config{,ure} from Sherlock

---
 doc/config    | 181 ++++++++++++++++++++++++++++++++++++++++++++++++++
 doc/configure |  53 +++++++++++++++
 2 files changed, 234 insertions(+)
 create mode 100644 doc/config
 create mode 100644 doc/configure

diff --git a/doc/config b/doc/config
new file mode 100644
index 00000000..8f04e763
--- /dev/null
+++ b/doc/config
@@ -0,0 +1,181 @@
+Configuration files
+===================
+
+This document describes the language of the run-time configuration
+files, which are parsed by libucw. For compile-time configuration,
+see doc/configure.
+
+Terminology :
+-------------
+
+Configuration items are organized into sections. The sections form a tree
+structure with top-level sections corresponding to program modules.
+
+Each configuration item belongs to one of the following classes:
+
+  1. single value or a fixed-length array of values
+  2. variable-length array of values
+  3. subsection with several nested attributes
+  4. list of nodes, each being an instance of a subsection
+  5. bitmap of small integers (0..31) or fixed list of strings
+  6. exceptions (items with irregular syntax; however, they always
+     appear as a sequence of strings, only the semantics differ)
+
+Both fixed- and variable-length arrays consist of items of the same
+type. The basic types supported by the configuration mechanism are:
+
+  1. 32-bit integer
+  2. 64-bit integer
+  3. floating point number
+  4. IP address
+  5. string
+  6. choice (one of a fixed list of strings)
+
+Program modules can define their own special types (such as network
+masks or attribute names) and decide how they are parsed.
+
+Format of configuration files :
+-------------------------------
+
+Configuration files are text files that usually set one attribute per
+line, though it is possible to split one assignment into multiple lines
+and/or assign several attributes in one line. The basic format of an
+assignment command is
+
+    name value1 value2 ... valueN
+
+or
+
+    name=value1 value2 ... valueN
+
+The end of a line also ends the command unless it is preceded by a
+backslash. On the other hand, a semicolon terminates the command and
+another command can start after the semicolon. A hash starts a comment
+that lasts until the end of the line. A value can be enclosed in
+apostrophes or quotation marks and then it can contain spaces and/or
+control characters, otherwise the first space or control character
+denotes the end of the value. Values enclosed in quotation marks are
+interpreted as C-strings. For example, the following are valid
+assignment commands:
+
+    Database "main db\x2b"; Directory='index/'; Weights 100 20 30 \
+      40 50 80 # a comment that is ignored
+
+Numerical values can be followed by a unit. The following units are
+supported:
+
+    d=86400   k=1000         K=1024
+    h=3600    m=1000000      M=1048576
+    %=0.01    g=1000000000   G=1073741824
+
+Attributes of a section or a list node can be set in two ways. First,
+you can write the name of the section or list, open a bracket, and then
+set the attributes inside the section. For example,
+
+    Section1 {
+      Attr1 value1
+      Attr2 value2
+      ListNode { #creates a list and adds its first node
+        Attr3 value3
+        Attr4 value4
+      }
+      ListNode { Attr3=value5; Attr4=value6 }
+        #appends a new node; this is still the same syntax
+    }
+
+The second possibility is using a shorter syntax when all attributes of a
+section are set on one line in a fixed order. The above example could
+equally be written as
+
+    Section1 {
+      Attr1 value1
+      Attr2 value2
+      ListNode value3 value4
+      ListNode value5 value6
+    }
+
+Of course, you cannot use the latter syntax when the attributes allow
+variable numbers of parameters. The parser of the configuration files
+checks this possibility.
+
+If you want to set a single attribute in some section, you can also
+refer to the attribute as Section.Attribute.
+
+Lists support several operations besides adding a new node. You just
+have to write a colon immediately after the attribute name, followed by
+the name of the operation. The following operations are supported:
+
+    List:clear                            # removes all nodes
+    List:append { attr1=value1; ... }     # adds a new node at the end
+    List:prepend { attr1=value1; ... }    # adds a new node at the beginning
+    List:remove { attr1=search1 }         # find a node and delete it
+    List:edit { attr1=search1 } { attr1=value1; ... }
+                                          # find a node and edit it
+    List:after { attr1=search1 } { ... }  # insert a node after a found node
+    List:before { attr1=search1 } { ... } # insert a node before a found node
+    List:copy { attr1=search1 } { ... }   # duplicate a node and edit the copy
+
+You can specify several attributes in the search condition and the nodes
+are tested for equality in all these attributes. In the editing
+commands, you can either open a second block with overridden attributes,
+or specify the new values using the shorter one-line syntax.
+
+The commands :clear, :append, and :prepend are also supported by var-length
+arrays. The command :clear can also be used on string values. The following
+operations can be used on bitmaps: :set, :remove, :clear, and :all.
+
+Configuration files :
+---------------------
+
+The default configuration file is stored in cf/sherlock and it uses
+the include directive to load nested configuration files:
+
+    include another/file
+
+(Beware that this command has to be the last one on the line.)
+
+If you want to modify the configuration, the preferred way is to edit
+the file cf/local, which is included at the end of cf/sherlock, so you
+can override anything there.
+
+Command-line parameters :
+-------------------------
+
+The default configuration file (cf_def_file possibly overridden
+by environment variable cf_env_file) is read before the program is started.
+You can use a -C option to override the name of the configuration file.
+If you use this parameter several times, then all those files are loaded
+consecutively. A parameter -S can be used to execute a configuration
+command directly (after loading the default or specified configuration
+file). Example:
+
+    bin/program -Ccf/my-config -S'module.trace=2;module.logfile:clear' ...
+
+If the program is compiled with debugging information, then one more
+parameter --dumpconfig is supported. It prints all parsed configuration
+items and exits.
+
+All these switches must be used before any other parameters of the
+program.
+
+Preprocessing :
+---------------
+
+During compilation, all configuration files are pre-processed by a simple
+C-like preprocessor, which supports #ifdef, #ifndef, #if, #elsif, #else and #endif
+directives referring to compile-time configuration variables. #if and #elsif
+can contain any Perl expression where each CONFIG_xyz configuration variable
+is replaced by 0 or 1 depending on its value.
+
+The preprocessor also replaces `@VARIABLE@' with the value of the variable,
+which must be defined.
+
+Caveats :
+---------
+
+Trying to access an unknown attribute causes an error, but unrecognized
+top-level sections are ignored. The reason is that a common config file
+is used for a lot of programs which recognize only their own sections.
+
+Names of sections, attributes and choices are case-insensitive. Units are
+case-sensitive.
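
Reviewer note, not part of the patch: the last caveat in doc/config is easy to miss, because the unit table distinguishes `k` from `K` and `m` from `M`. The following is a minimal Python sketch of that behaviour, not libucw's actual parser. The names `UNITS`, `apply_unit` and `same_name` are hypothetical, and the sketch assumes the unit is the single last character of a value.

```python
# Illustrative model of the unit table in doc/config; not libucw's parser.
# UNITS, apply_unit and same_name are hypothetical names for this sketch.
UNITS = {
    "d": 86400, "k": 1000, "K": 1024,
    "h": 3600, "m": 1000000, "M": 1048576,
    "%": 0.01, "g": 1000000000, "G": 1073741824,
}


def apply_unit(text):
    """Scale a number by an optional one-character unit suffix.

    Assumption: the unit is the last character of the value.
    The dictionary lookup is case-sensitive, so "k" and "K" differ.
    """
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def same_name(a, b):
    """Compare section, attribute or choice names, ignoring case."""
    return a.lower() == b.lower()


if __name__ == "__main__":
    print(apply_unit("4k"), apply_unit("4K"))
    print(same_name("Section1", "SECTION1"))
```

In this model `4k` and `4K` scale by 1000 and 1024 respectively, while `Section1` and `SECTION1` compare as the same name.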
diff --git a/doc/configure b/doc/configure
new file mode 100644
index 00000000..355cc6c9
--- /dev/null
+++ b/doc/configure
@@ -0,0 +1,53 @@
+How to Configure Sherlock    FIXME: update for stand-alone libucw
+*************************
+
+1. What can be configured
+~~~~~~~~~~~~~~~~~~~~~~~~~
+There are three different levels of configuring/customizing Sherlock:
+
+  - runtime configuration in cf/sherlock and cf/local (see doc/config)
+
+  - compile-time configuration: config switches set before compiling, selecting
+    optional features (see sherlock/default.cfg)
+
+  - customization: a set of C declarations and functions, which define word
+    types, attributes and the ways to match them. In most cases, you should
+    use the "free" customization designed for searching in web pages and similar
+    documents. If you want just a simple search engine for indexing databases,
+    the "customs/bare" customization can be better. If you want to extend Sherlock
+    beyond that, you will probably need to write your own customization -- just
+    follow the structure of the "free" customization and the comments there.
+
+
+2. How to configure
+~~~~~~~~~~~~~~~~~~~
+To set up compilation for a given customization, possibly overriding its default
+compile-time options, just run
+
+    ./configure [