mj.ucw.cz Git - libucw.git/log
libucw.git
Robert Spalek [Fri, 30 Mar 2001 13:07:05 +0000 (13:07 +0000)]
rx_compile() can now compile with IGNORING CASE enabled too
regex-test added

Robert Spalek [Fri, 30 Mar 2001 09:18:14 +0000 (09:18 +0000)]
cards.c printed tags converted tolower

Robert Spalek [Fri, 30 Mar 2001 08:43:38 +0000 (08:43 +0000)]
added WT_NAMES from WORD_TYPE_NAMES temporarily, MJ: please check it

Martin Mares [Tue, 27 Mar 2001 17:41:01 +0000 (17:41 +0000)]
Mapping of zero-length files returns just a random non-zero address.

With this fix, empty indices are generated correctly.

Martin Mares [Tue, 27 Mar 2001 16:46:31 +0000 (16:46 +0000)]
Added optional work-arounds for path underflows and leading/trailing
spaces in URL's.

Martin Mares [Tue, 27 Mar 2001 16:29:07 +0000 (16:29 +0000)]
Added ASSERT checks for tag byte syntax. Didn't find any errors yet.

Martin Mares [Tue, 27 Mar 2001 11:02:42 +0000 (11:02 +0000)]
Load the default config file on first non-config option (several options
require config to be loaded).

Martin Mares [Tue, 27 Mar 2001 10:57:33 +0000 (10:57 +0000)]
Removed tempfile functions (nobody uses them and they probably belong
to fastbuf.c anyway).

Martin Mares [Tue, 27 Mar 2001 10:52:48 +0000 (10:52 +0000)]
Added ABS macro.

Martin Mares [Tue, 27 Mar 2001 10:29:51 +0000 (10:29 +0000)]
Removed FIXME.

Martin Mares [Tue, 27 Mar 2001 10:28:31 +0000 (10:28 +0000)]
Slow case of b(get|put)_utf8 no longer inline.

Robert Spalek [Thu, 22 Mar 2001 15:56:47 +0000 (15:56 +0000)]
CVS repository cleaned up a bit:
gather/{objdump.c,dumpconfig.[ch]} and indexer/idxdump.c --> utils
utils/lfstest.c deleted
filter/ftest is not compiled by default
rule for making lib/lfs-test.c added into Makefile

Martin Mares [Mon, 19 Mar 2001 19:55:49 +0000 (19:55 +0000)]
Oops, endianity problem in reference files.

Martin Mares [Sat, 17 Mar 2001 15:03:47 +0000 (15:03 +0000)]
Better setproctitle() inspired by sendmail's one.

Martin Mares [Sat, 17 Mar 2001 14:42:16 +0000 (14:42 +0000)]
Define setproctitle() and use it for gatherer thread status reporting.

Martin Mares [Fri, 16 Mar 2001 22:04:59 +0000 (22:04 +0000)]
Moved generic heap macros to heap.h.

Martin Mares [Thu, 15 Mar 2001 22:23:43 +0000 (22:23 +0000)]
Changed locking mechanism of the bucket library to fcntl() instead
of flock() as the flock locks have totally broken semantics -- they
happily permit multiple locks on a shared fd!
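
For illustration, whole-file locking through fcntl() might look like the sketch below. The helper names file_lock() and lock_demo() are hypothetical and are not taken from the bucket library; only the fcntl() call and the struct flock fields are standard POSIX.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: lock or unlock the whole file behind fd.
 * type is F_RDLCK, F_WRLCK or F_UNLCK; F_SETLKW blocks until granted. */
static int file_lock(int fd, short type)
{
  struct flock fl;
  memset(&fl, 0, sizeof(fl));
  fl.l_type = type;
  fl.l_whence = SEEK_SET;
  fl.l_start = 0;
  fl.l_len = 0;                 /* zero length means "up to end of file" */
  return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical demo: create a file, lock it exclusively, unlock, clean up.
 * Returns 0 on success and -1 on any failure. */
static int lock_demo(const char *path)
{
  int fd = open(path, O_RDWR | O_CREAT, 0600);
  if (fd < 0)
    return -1;
  int ok = file_lock(fd, F_WRLCK) == 0 && file_lock(fd, F_UNLCK) == 0;
  close(fd);
  unlink(path);
  return ok ? 0 : -1;
}
```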

Martin Mares [Thu, 15 Mar 2001 22:22:34 +0000 (22:22 +0000)]
Use sh_ftruncate() instead of ftruncate().

Martin Mares [Thu, 15 Mar 2001 20:48:33 +0000 (20:48 +0000)]
Buckettool converted to standard command line parsing,
so that config files work there as well.

Martin Mares [Thu, 15 Mar 2001 20:11:34 +0000 (20:11 +0000)]
Bucket file name and I/O buffer size are no longer hard-wired.

Audited locking, but found no bugs.

Martin Mares [Sat, 10 Mar 2001 16:09:13 +0000 (16:09 +0000)]
Added macro for fetching of u32's aligned on 2-byte boundary.
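
One way such a fetch could work is to read two 16-bit halves and combine them, as in the sketch below. The name get_u32_align2() is hypothetical, the value is assumed to be stored in native byte order, and __BYTE_ORDER__ is a GCC/Clang predefined macro; the actual macro in the library may be written quite differently.

```c
#include <stdint.h>

/* Hypothetical name. Assumes p is 2-byte aligned, points into 16-bit data,
 * and the 32-bit value is stored in native byte order. */
static inline uint32_t get_u32_align2(const uint16_t *p)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  /* Low half comes first in memory. */
  return (uint32_t)p[0] | ((uint32_t)p[1] << 16);
#else
  /* High half comes first in memory. */
  return ((uint32_t)p[0] << 16) | (uint32_t)p[1];
#endif
}
```

Two aligned 16-bit loads avoid a 32-bit load from an address that is not a multiple of four, which some architectures trap on.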

Martin Mares [Fri, 9 Mar 2001 12:43:20 +0000 (12:43 +0000)]
Some typecasts due to LFS.

Martin Mares [Wed, 7 Mar 2001 13:38:19 +0000 (13:38 +0000)]
Created a new include for efficient access to unaligned data.
Updated fastbuf to use it.
Removed sh_foff_t -- all modules should use sh_off_t instead (the only
  case where it would differ is the 2G--4G range of file sizes you get
  when LFS is turned on and LARGE_DB turned off and it's not interesting
  anyway)
bgeto/bgetp selection is done in config.h.
LFS is turned on by default.

Martin Mares [Wed, 7 Mar 2001 13:35:35 +0000 (13:35 +0000)]
Removed old incarnation of CLAMP.

Martin Mares [Wed, 7 Mar 2001 13:34:58 +0000 (13:34 +0000)]
Added a couple of useful macros: MIN, MAX, CLAMP, ARRAY_SIZE.

Removed NULL as it's already defined in config.h.
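
As a rough sketch, macros with these names are commonly defined along the following lines. These definitions are an assumption for illustration and are not copied from the library.

```c
#include <stddef.h>

/* Illustrative definitions only. Arguments may be evaluated more than once,
 * so avoid passing expressions with side effects. */
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
#define CLAMP(x, lo, hi) (((x) < (lo)) ? (lo) : (((x) > (hi)) ? (hi) : (x)))

/* Number of elements; valid only for true arrays, not for pointers. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
```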

Martin Mares [Wed, 7 Mar 2001 13:33:37 +0000 (13:33 +0000)]
Added sh_mmap().

For now, only the "use glibc" interface works, the others miss several
essential functions (namely sh_ftruncate and sh_mmap).

Martin Mares [Wed, 7 Mar 2001 13:27:17 +0000 (13:27 +0000)]
open -> sh_open

Martin Mares [Wed, 7 Mar 2001 13:24:00 +0000 (13:24 +0000)]
Use sh_open instead of open.

Martin Mares [Wed, 7 Mar 2001 12:34:14 +0000 (12:34 +0000)]
Don't crash when the last fragment is of zero length.

No more SEGV's on truncated files.

Martin Mares [Tue, 6 Mar 2001 17:42:17 +0000 (17:42 +0000)]
The StringMap file contains an end marker pointing past the last
string reference.

Martin Mares [Mon, 5 Mar 2001 12:35:58 +0000 (12:35 +0000)]
Introduced obuck_get_pos(), converted gatherd limits to use it.

Robert Spalek [Mon, 5 Mar 2001 12:26:54 +0000 (12:26 +0000)]
bugfix

Martin Mares [Mon, 5 Mar 2001 11:52:14 +0000 (11:52 +0000)]
Removed cf_read() and cf_default_*().

Robert Spalek [Mon, 5 Mar 2001 09:37:32 +0000 (09:37 +0000)]
obuck_size() deleted

Robert Spalek [Mon, 5 Mar 2001 09:02:11 +0000 (09:02 +0000)]
cf_default_init() replaced by direct access to cfdeffile, the default value
is DEFAULT_CONFIG
cf_default_done() replaced by an automatic call when getopt() returns -1

Robert Spalek [Sun, 4 Mar 2001 15:21:50 +0000 (15:21 +0000)]
cf_default_{init,done} interface used instead of cf_read

Robert Spalek [Sun, 4 Mar 2001 15:19:26 +0000 (15:19 +0000)]
added cf_default_{init,done} for setting the default config filename, which
will be read automatically unless overridden by a command-line option

Robert Spalek [Sun, 4 Mar 2001 15:18:08 +0000 (15:18 +0000)]
added obuck_size() returning the size of bucket file (for gatherd.c stopping
after maximal size is reached)

Martin Mares [Sat, 3 Mar 2001 22:50:29 +0000 (22:50 +0000)]
Moved fp_hash() to index.h.

Martin Mares [Sat, 3 Mar 2001 17:22:16 +0000 (17:22 +0000)]
Define FASTBUF_BYTES_PER_(O|P).

Martin Mares [Sat, 3 Mar 2001 13:39:36 +0000 (13:39 +0000)]
Added indexer names for word and string type classes.

Martin Mares [Sat, 3 Mar 2001 12:12:35 +0000 (12:12 +0000)]
Defined which string classes contain URL's and which ones case insensitive
strings.

Robert Spalek [Fri, 2 Mar 2001 13:36:47 +0000 (13:36 +0000)]
added

Martin Mares [Fri, 2 Mar 2001 11:30:11 +0000 (11:30 +0000)]
Updated the charset conversion library to Unicode 3.0.

Removed ice age relics.

Removed signature tables as they are not used anyway.

Martin Mares [Fri, 2 Mar 2001 11:00:33 +0000 (11:00 +0000)]
Replaced <sys/time.h> by <time.h> where appropriate.

Martin Mares [Thu, 1 Mar 2001 17:31:47 +0000 (17:31 +0000)]
Fixed bug in generating UTF-8 for codes >= 0x800.
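
For reference, codes from 0x800 upward are where UTF-8 switches from two-byte to three-byte sequences. The sketch below shows the standard encoding for codes below 0x10000; the function name put_utf8_sketch() is hypothetical and this is not the library's own encoder.

```c
/* Encode u (assumed to be < 0x10000) as UTF-8 into out[].
 * Returns the number of bytes written (1 to 3). Hypothetical helper. */
static int put_utf8_sketch(unsigned char *out, unsigned int u)
{
  if (u < 0x80)
    {
      out[0] = (unsigned char) u;                         /* 0xxxxxxx */
      return 1;
    }
  if (u < 0x800)
    {
      out[0] = (unsigned char) (0xc0 | (u >> 6));         /* 110xxxxx */
      out[1] = (unsigned char) (0x80 | (u & 0x3f));       /* 10xxxxxx */
      return 2;
    }
  out[0] = (unsigned char) (0xe0 | (u >> 12));            /* 1110xxxx */
  out[1] = (unsigned char) (0x80 | ((u >> 6) & 0x3f));    /* 10xxxxxx */
  out[2] = (unsigned char) (0x80 | (u & 0x3f));           /* 10xxxxxx */
  return 3;
}
```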

Martin Mares [Thu, 1 Mar 2001 16:57:58 +0000 (16:57 +0000)]
Defined a GET_TAGGED_CHAR macro to read our internal representation
of tagged text, mapping the tags to character codes >= 0x80000000.

Martin Mares [Fri, 23 Feb 2001 14:02:46 +0000 (14:02 +0000)]
Generate index cards.

The chewer is complete, but it will probably need a bit of optimization
and fine-tuning when we get some real data. The searching for words present
in the cache doesn't look good.

Martin Mares [Fri, 23 Feb 2001 10:53:32 +0000 (10:53 +0000)]
Indexing of strings.

Martin Mares [Thu, 22 Feb 2001 16:01:49 +0000 (16:01 +0000)]
Some more chewer work...

Martin Mares [Tue, 20 Feb 2001 22:32:06 +0000 (22:32 +0000)]
Added bgets0().

Martin Mares [Tue, 20 Feb 2001 17:51:24 +0000 (17:51 +0000)]
Added a useful macro for value clamping.

Martin Mares [Mon, 19 Feb 2001 19:08:56 +0000 (19:08 +0000)]
Oops, breadb() was wrong.

Martin Mares [Mon, 19 Feb 2001 18:53:46 +0000 (18:53 +0000)]
Added breadb() which acts just like bread(), but die()s if a partial
record is read. This is mainly to avoid consistency checks in main
code path.

Martin Mares [Fri, 16 Feb 2001 20:16:51 +0000 (20:16 +0000)]
SORT_DELETE_INPUT works even with SORT_INPUT_FB.

Martin Mares [Fri, 16 Feb 2001 20:16:26 +0000 (20:16 +0000)]
Declare fingerprints as 12 bytes, not 3 u32's.

Martin Mares [Fri, 16 Feb 2001 20:16:00 +0000 (20:16 +0000)]
Added #define PACKED __attribute__((packed)).
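
To illustrate what the attribute does, the sketch below applies PACKED to a made-up record. The struct and its field names are hypothetical; the attribute itself is a GCC extension (also accepted by Clang).

```c
#include <stdint.h>

#define PACKED __attribute__((packed))

/* Hypothetical record. Without PACKED the compiler would typically insert
 * padding after `tag` so that `value` is 4-byte aligned. */
struct packed_rec {
  uint8_t tag;
  uint32_t value;
  uint8_t hash[12];
} PACKED;
```

With the attribute, sizeof(struct packed_rec) is 1 + 4 + 12 = 17 bytes, which matters when the structure mirrors an on-disk layout.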

Martin Mares [Fri, 16 Feb 2001 18:54:31 +0000 (18:54 +0000)]
Added merger.

Martin Mares [Fri, 16 Feb 2001 17:56:06 +0000 (17:56 +0000)]
Added unmapping and writeable mappings.

Martin Mares [Fri, 16 Feb 2001 16:16:25 +0000 (16:16 +0000)]
Scanner improvements: create redirect backlinks, detect empty documents,
mark accented documents.

Martin Mares [Thu, 15 Feb 2001 19:17:47 +0000 (19:17 +0000)]
Testing programs are not built by default.

Martin Mares [Thu, 15 Feb 2001 19:05:39 +0000 (19:05 +0000)]
Added URL fingerprints.

Martin Mares [Thu, 15 Feb 2001 19:04:57 +0000 (19:04 +0000)]
Added bputs0() -- put a null-terminated string.

Martin Mares [Sat, 10 Feb 2001 12:28:22 +0000 (12:28 +0000)]
Shut up warnings.

Robert Spalek [Fri, 9 Feb 2001 10:48:55 +0000 (10:48 +0000)]
added cf_item_count()

Robert Spalek [Fri, 9 Feb 2001 10:48:17 +0000 (10:48 +0000)]
deleted unused variable prog

Robert Spalek [Fri, 9 Feb 2001 10:48:02 +0000 (10:48 +0000)]
added sort-test

Martin Mares [Sun, 4 Feb 2001 20:08:14 +0000 (20:08 +0000)]
Next version of the sorter -- both presorting and unifying works.

sort-test now does just `sort -u' and it's about 30% slower than its
GNU counterpart, probably due to extra copies of sorting keys by our
buffered I/O layer. Fortunately, a typical case will be long data with
short keys where we should be efficient as we can use bbcopy().

Robert Spalek [Sun, 4 Feb 2001 15:29:15 +0000 (15:29 +0000)]
cf_get_item added

Martin Mares [Sun, 4 Feb 2001 14:44:54 +0000 (14:44 +0000)]
First version of the sorter. No presorting phase yet.

Martin Mares [Sun, 4 Feb 2001 14:44:00 +0000 (14:44 +0000)]
Added "is_temp_file" attribute which causes automatic deletion of the
file upon bclose().

Martin Mares [Sun, 28 Jan 2001 21:40:27 +0000 (21:40 +0000)]
Don't log pid until log_fork() is called.

Martin Mares [Sun, 28 Jan 2001 20:50:23 +0000 (20:50 +0000)]
Added __attribute__((format...)) to declaration of log(), so that
discrepancies between format string and arguments get easily found.
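
A minimal sketch of the technique follows. The logger is called log_msg() here, a hypothetical name chosen to avoid clashing with log() from <math.h>, and it returns the formatted length purely so the example can be checked; the real declaration may differ.

```c
#include <stdarg.h>
#include <stdio.h>

/* format(printf, 1, 2): argument 1 is the format string and the variadic
 * arguments start at position 2, so GCC can check them at compile time. */
int log_msg(const char *fmt, ...) __attribute__((format(printf, 1, 2)));

/* Hypothetical logger: prints the message and a newline to stderr and
 * returns the number of characters in the formatted message. */
int log_msg(const char *fmt, ...)
{
  va_list args;
  va_start(args, fmt);
  int n = vfprintf(stderr, fmt, args);
  va_end(args);
  fputc('\n', stderr);
  return n;
}
```

With this declaration, a call such as log_msg("%s", 42) draws a -Wformat warning instead of failing at run time.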

Martin Mares [Fri, 26 Jan 2001 16:28:09 +0000 (16:28 +0000)]
oid's above OBUCK_OID_FIRST_SPECIAL are reserved for encoding of
error codes and other stuff.

Martin Mares [Fri, 26 Jan 2001 16:27:38 +0000 (16:27 +0000)]
obj_add_attr() now returns head of the attribute value chain, so that you
can easily do:

struct oattr *a;
a = obj_add_attr(obj, NULL, 'x', "val1");
a = obj_add_attr(obj, a, 'x', "val2");

to insert multi-valued attributes quickly.

Martin Mares [Fri, 26 Jan 2001 16:25:49 +0000 (16:25 +0000)]
Added log_fork() which should be called after fork()ing to invalidate cached PID.
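
The idea of invalidating a cached PID can be sketched as follows. Both function names are hypothetical stand-ins, and the caching scheme shown is an assumption about how such a logger might work rather than the library's code.

```c
#include <sys/types.h>
#include <unistd.h>

static pid_t cached_pid;        /* 0 means "not cached yet" */

/* Hypothetical accessor used when formatting a log line:
 * calls getpid() only when nothing is cached. */
static pid_t log_current_pid(void)
{
  if (!cached_pid)
    cached_pid = getpid();
  return cached_pid;
}

/* Illustrative stand-in for log_fork(): drop the cached value so that
 * a child process does not keep reporting its parent's PID. */
static void log_fork_sketch(void)
{
  cached_pid = 0;
}
```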

Martin Mares [Thu, 25 Jan 2001 16:01:53 +0000 (16:01 +0000)]
Use xmalloc_zero().

Martin Mares [Thu, 25 Jan 2001 14:51:47 +0000 (14:51 +0000)]
Newlines and carriage returns are considered blanks. Carefully checked
all modules using Cspace() and Cblank().

Martin Mares [Tue, 23 Jan 2001 14:59:36 +0000 (14:59 +0000)]
Word type 0 is reserved.

Robert Spalek [Sun, 21 Jan 2001 20:00:58 +0000 (20:00 +0000)]
declaration cosmetic fix

Martin Mares [Sun, 21 Jan 2001 19:48:34 +0000 (19:48 +0000)]
HTML parser basically works. A *lot* of things still need to be cleaned up.

Robert Spalek [Sun, 21 Jan 2001 19:14:46 +0000 (19:14 +0000)]
memory pool for everything about configuration added
value of CT_FUNCTION callback is now cfg_stralloc()'ed

Martin Mares [Sun, 21 Jan 2001 18:20:23 +0000 (18:20 +0000)]
Added the genhash utility (simple gperf replacement).

Martin Mares [Sun, 21 Jan 2001 17:59:58 +0000 (17:59 +0000)]
Added functions for reading/writing UTF-8 characters on fastbuf streams.

Martin Mares [Sun, 21 Jan 2001 17:59:33 +0000 (17:59 +0000)]
Created a header which will contain description of all data structures
used in the indices. For now, there are word categories.

Martin Mares [Sun, 21 Jan 2001 14:14:04 +0000 (14:14 +0000)]
Added "direct buffer I/O" interface for those who want to avoid an extra
copy of data during read/write at expense of having to be prepared for
any data size the buffering layer tells them to read/write.

Martin Mares [Sun, 21 Jan 2001 11:10:45 +0000 (11:10 +0000)]
Use xmalloc_zero() instead of xmalloc() followed by bzero().

Martin Mares [Sun, 21 Jan 2001 11:09:42 +0000 (11:09 +0000)]
Use xmalloc_zero() instead of xmalloc(), thus squashing several uninitialized
structure field bugs. Also simplified the code a lot.
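
A zeroing allocator of this kind can be sketched as below. The bodies are an assumption for illustration (the names carry a _sketch suffix to make that plain); the library's own xmalloc() and xmalloc_zero() may handle errors differently.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative allocator that never returns NULL: it aborts on failure. */
static void *xmalloc_sketch(size_t size)
{
  void *p = malloc(size ? size : 1);
  if (!p)
    {
      fprintf(stderr, "Cannot allocate %zu bytes\n", size);
      exit(1);
    }
  return p;
}

/* Same as above, but every byte of the block starts out as zero, so
 * structure fields that are never assigned are not left with garbage. */
static void *xmalloc_zero_sketch(size_t size)
{
  void *p = xmalloc_sketch(size);
  memset(p, 0, size);
  return p;
}
```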

Martin Mares [Sun, 21 Jan 2001 11:07:58 +0000 (11:07 +0000)]
Introduced mp_alloc_zero().

Martin Mares [Sun, 21 Jan 2001 11:07:40 +0000 (11:07 +0000)]
Introduced xmalloc_zero().

Martin Mares [Wed, 17 Jan 2001 13:10:28 +0000 (13:10 +0000)]
Exported conversions between internal character codes and UCS.

Martin Mares [Mon, 15 Jan 2001 11:30:45 +0000 (11:30 +0000)]
Fixed a couple of bugs.

Robert Spalek [Mon, 15 Jan 2001 10:11:05 +0000 (10:11 +0000)]
rewritten, enhanced, updated, fixed

Martin Mares [Mon, 15 Jan 2001 09:36:26 +0000 (09:36 +0000)]
Added an explanatory comment.

Martin Mares [Sun, 14 Jan 2001 21:46:09 +0000 (21:46 +0000)]
Guards, guards!

Martin Mares [Sun, 14 Jan 2001 20:50:13 +0000 (20:50 +0000)]
Remember to link object.o to the library. :)

Martin Mares [Sun, 14 Jan 2001 20:50:00 +0000 (20:50 +0000)]
Add sh_time_t type.

Martin Mares [Sun, 14 Jan 2001 20:49:49 +0000 (20:49 +0000)]
Allow bclose(NULL).

Martin Mares [Sun, 14 Jan 2001 20:49:41 +0000 (20:49 +0000)]
Guard against multiple inclusion.

Martin Mares [Sun, 14 Jan 2001 20:49:29 +0000 (20:49 +0000)]
Import object attribute handling from old Sherlock.