The Sherlock Network Search Engine
Sherlock is a universal, extensible system for collecting documents
distributed across the network (e.g., on the World-Wide Web), indexing
them and offering full-text search capabilities.
The system is still under development and is occasionally run
for experimental purposes. Modules finished so far include:
- gatherd: Information gathering supervisor (starts all download and
  analysis modules of the system and controls the object queue).
- httpget: Downloads files via HTTP.
- fileget: Processes locally accessible files.
- htmlchew: Analyses HTML documents and extracts data from them.
- textchew: The same for ASCII texts.
- gived: Object information distribution daemon.
- objget: Client for gived.
- dbuild: Database builder for the search engine.
- sherlockd: The full-text search engine.
- scgi: WWW interface for the search engine.
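
The division of labour between building a database and answering full-text queries can be illustrated with a small sketch. It is not Sherlock's code and says nothing about how dbuild or sherlockd actually work. It is a minimal in-memory inverted index in Python, and every name in it (tokenize, build_index, search, the sample documents) is hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it.

    Hypothetical stand-in for a database-building step.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query word.

    Hypothetical stand-in for a full-text query step.
    """
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


# Sample data invented for illustration only.
documents = {
    "doc1": "Sherlock collects documents from the network",
    "doc2": "Full-text search over collected documents",
}
index = build_index(documents)
print(search(index, "documents"))
print(search(index, "full-text search"))
```

A query matches only documents that contain all of its words. A real engine would presumably add persistent storage, ranking and word normalisation on top of this.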
New modules will probably appear soon.
The current version can be downloaded here
and sometimes runs here.
If you want to help with this project, suggest extensions, or report
bugs, feel free to contact the author.
Last modified 28. 7. 1997 by Martin Mares