================================================================================
                        Duplicate Mail Checker @VERSION@
                             (c) 2006 Martin Mares
================================================================================

This package tries to solve two problems which were plaguing my mail-box for
a long time:

(1) In many cases, a single message arrives in several copies through
    different paths. The copies usually share a single Message-ID, so
    I decided to scan incoming mail for duplicate Message-IDs (for this
    purpose, I am willing to believe that SHA-1 is a perfect hash function).

(2) Copies of messages originated by me frequently loop back to me. This
    could be handled by recording the Message-IDs of outgoing mail, thus
    reducing it to the previous problem, but unfortunately this is not
    applicable to my situation, as I send mail from multiple machines (not
    all of them on-line all the time).

    However, there is an easier solution: generate special Message-IDs for
    all my outgoing mail, so it can be easily recognized. (It's tempting to
    use the From field for such purposes, but that has its own problems --
    for example, I want to exclude all mail generated by scripts and sent
    by means other than my MUA, since such mail is not recorded in the
    outbox when sending.)

    For this purpose, I use Message-ID: , where USER@DOMAIN is my mail
    address and HOST is the machine I've sent the mail from (to avoid
    potential duplicates when sending from multiple machines).

So I have written the `mdup' utility, which receives mail headers on its
standard input, parses them, records a SHA-1 hash of the Message-ID in
a database and prints a verdict to the standard output:

	DUP		Message-ID already known
	LOCAL		Message-ID recognized as originated by me
	OK		Message-ID not recognized
	NO ID		The message has no ID
	ERROR		Some other error occurred

(Things like the USER and DOMAIN for detection of local mail, or the lifetime
of all database records, get passed as command-line options.)

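The verdict logic described above can be sketched roughly as follows. This is
an illustration in Python, not the actual `mdup' source: the function names
(extract_message_id, verdict), the in-memory set standing in for the on-disk
database, and the caller-supplied is_local predicate are all assumptions made
for the example. The order in which LOCAL and DUP are checked is also a guess,
and record lifetimes and the ERROR verdict are left out.

```python
import hashlib
import re


def extract_message_id(headers):
    """Return the first Message-ID value (with angle brackets), or None."""
    # Join folded header lines (a line break followed by whitespace).
    unfolded = re.sub(r"\r?\n[ \t]+", " ", headers)
    match = re.search(r"^Message-ID:[ \t]*(<[^>\s]+>)", unfolded,
                      re.IGNORECASE | re.MULTILINE)
    return match.group(1) if match else None


def verdict(headers, seen, is_local=lambda message_id: False):
    """Classify a message by its Message-ID.

    `seen` is a set of SHA-1 digests standing in for the database;
    `is_local` is a hypothetical predicate recognizing our own IDs.
    """
    message_id = extract_message_id(headers)
    if message_id is None:
        return "NO ID"
    if is_local(message_id):
        return "LOCAL"  # assumption: local IDs are reported before DUP
    digest = hashlib.sha1(message_id.encode("utf-8")).digest()
    if digest in seen:
        return "DUP"
    seen.add(digest)  # remember the hash so a later copy is a DUP
    return "OK"
```

Feeding the same headers twice with one shared set yields OK for the first
copy and DUP for the second; headers without a Message-ID yield NO ID.
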
This can be easily used in procmail rules:

	DUPS=`bin/mdup -lmj@ucw.cz`

	:0
	* DUPS ?? LOCAL
	Mail/looped_back

	:0
	* DUPS ?? DUP
	Mail/duplicates

The only remaining problem is how to set the Message-ID for outgoing mail.
I have written a sendmail wrapper (smail.c) for this purpose, but then
I realized that this way the Message-IDs don't get recorded properly in the
sent mail folder, breaking all threading.

However, if you use Mutt, there is a neat (but a little devilish) trick for
generating Message-IDs using just Mutt's configuration language:

	send-hook .* my_hdr Message-Id: 

(Note the backslashes -- they cause the backticks to be interpolated when
the hook gets executed instead of when it gets parsed.)

Please send all bug reports and suggestions to mj@ucw.cz.

					Have fun
							Martin


LICENSE
~~~~~~~
This program can be distributed and used according to the terms of the
GNU General Public License version 2 or newer.


CAVEATS
~~~~~~~
`mdup' assumes that the ID hash database (by default located in ~/.mdup.db)
can be locked by flock(), so if your home directory is shared over NFS,
strange things might (and probably will) happen. Better not do that.
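
To illustrate the locking assumption mentioned above, here is a minimal
Python sketch of guarding a database file with flock(). It is not taken from
`mdup' itself; the helper name with_locked_db and the choice of a blocking
exclusive lock are assumptions made for the example. It works on Unix-like
systems only.

```python
import fcntl
import os


def with_locked_db(path, action):
    """Run action(fd) while holding an exclusive flock() on the file."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until no other process holds a conflicting lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        return action(fd)
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

On a filesystem where flock() is not honored across machines, two processes
on different hosts could both get past the lock at once, which is the kind of
trouble the caveat warns about.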