Auto-Expiry of Bayes Database Tokens

By default, SpamAssassin tries to expire the least frequently used and least common tokens from the Bayes database automatically, which keeps the database healthy and prevents it from growing without bound. Unfortunately, the auto-expiry mechanism is "opportunistic": SpamAssassin decides at scan time whether it should pause to perform a token expiry run. It uses a fairly complex algorithm to make that decision, based on the time since the last expiry, the number of tokens deleted during the last expiry run, the number of tokens currently in the database, and so forth. Because this check happens every time SpamAssassin scans an email, an expiry run is as likely to occur during peak traffic hours as during off-peak hours. In the middle of the business day, your mail system can therefore slow down without warning simply because the opportunistic auto-expiry mechanism decided the conditions were right.

Given this serious drawback, it is frankly hard to imagine a situation in which this sort of opportunistic auto-expiry is desirable. It seems much more efficient to do no token expiry at all during peak traffic hours and instead schedule a once-a-day expiry run at a time when you, as the mail administrator, know that traffic is typically very low. Fortunately, this is very easy to do.

In your local.cf file, disable the auto-expiry mechanism:

bayes_auto_expire  0

In your amavis/maia user's crontab, schedule a daily expiry process:

# expire Bayes tokens daily at 03:15
15 3 * * *   sa-learn --sync --force-expire

Equivalently, you can use the maiadbtool script:

# expire Bayes tokens daily at 03:15
15 3 * * *   maiadbtool.pl --expire-bayes
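
Once the cron job is in place, it can be worth confirming that expiry runs are actually happening. The following is a minimal sketch, not part of SpamAssassin or Maia: it assumes the usual layout of `sa-learn --dump magic` output, where the value sits in the third column of a line ending in "last expiry atime", and the function name seconds_since_last_expiry is purely illustrative.

```shell
#!/bin/sh
# Sketch: read `sa-learn --dump magic` output on stdin and print the number
# of seconds elapsed since the last Bayes expiry run.
# Assumption: the dump contains a line such as
#   0.000  0  1700000000  0  non-token data: last expiry atime
# with the Unix timestamp in the third column.

seconds_since_last_expiry() {
    # $1 is the current time as a Unix timestamp (e.g. from `date +%s`).
    awk -v now="$1" '
        # Matches only "last expiry atime", not "last expire atime delta".
        /last expiry atime/ { print now - $3; found = 1 }
        # Exit non-zero if the expected line was not present at all.
        END { if (!found) exit 1 }
    '
}
```

Run as the same user that owns the crontab entry, it might be used as `sa-learn --dump magic | seconds_since_last_expiry "$(date +%s)"`. With a daily schedule, a result far above 86400 would suggest the scheduled expiry is not running; a database that has never been expired will likely report a last expiry time of 0, so the result will simply equal the current timestamp.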
