Bayes not working in spamassassin

I have the same version of spamassassin running on two different Slackware computers. I believe I have enabled the Bayesian classifier on both. One host does give me Bayes scores in the message header, for example:
X-Spam-Report:
        *  0.0 HTML_MESSAGE BODY: HTML included in message
        *  2.0 BAYES_50 BODY: Bayes spam probability is 40 to 60%
        *      [score: 0.5000]


The other host's spamassassin never does. I'm quite sure the recalcitrant host has been trained with plenty of spam and ham.

Any ideas why Bayes is not kicking in on this host? What can I check?
Asked by: Mark

robocat commented:
Do you get an X-Spam-Report at all?
Mark (author) commented:
robocat: > Do you get a X-Spam-Report at all?

Yes, here's a complete report from a recent message. Notice no mention of Bayes.
X-Spam-Status: No, score=1.3 required=5.0 tests=AWL,DATE_IN_PAST_03_06,
        HTML_MESSAGE,SPF_HELO_PASS,SPF_PASS,T_RP_MATCHES_RCVD autolearn=no
        version=3.3.2
X-Spam-Report:
        * -0.0 SPF_HELO_PASS SPF: HELO matches SPF record
        * -0.0 T_RP_MATCHES_RCVD Envelope sender domain matches handover relay
        *      domain
        * -0.0 SPF_PASS SPF: sender matches SPF record
        *  1.1 DATE_IN_PAST_03_06 Date: is 3 to 6 hours before Received: date
        *  0.0 HTML_MESSAGE BODY: HTML included in message
        *  0.2 AWL AWL: From: address is in the auto white-list
X-Spam-Level: *
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on

robocat commented:
Run

spamassassin -D --lint

and check for any messages involving "bayes"
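
The filtering step can also be scripted. This is a minimal Python sketch, assuming the debug output has been captured as text (spamassassin writes `-D` output to stderr); the function names `bayes_lines` and `lint_bayes_lines` are illustrative, not part of SpamAssassin.

```python
import subprocess
from typing import List


def bayes_lines(debug_output: str) -> List[str]:
    """Return the debug lines that mention Bayes, case-insensitively."""
    return [line for line in debug_output.splitlines() if "bayes" in line.lower()]


def lint_bayes_lines() -> List[str]:
    """Run `spamassassin -D --lint` and keep only the Bayes-related lines.

    Assumes the spamassassin binary is on PATH. Debug output goes to stderr,
    so that is the stream inspected here.
    """
    result = subprocess.run(
        ["spamassassin", "-D", "--lint"],
        capture_output=True,
        text=True,
        check=False,
    )
    return bayes_lines(result.stderr)


# Usage on the host under test:
#     for line in lint_bayes_lines():
#         print(line)
```

Run it as the same user that scans the mail, since the lint output shows the Bayes database being opened from that user's home directory.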

Mark (author) commented:
Here are all the bayes related messages from the lint run on the non-working host:
Dec 23 10:50:05.770 [26285] dbg: plugin: loading Mail::SpamAssassin::Plugin::Bayes from @INC
Dec 23 10:50:05.893 [26285] dbg: config: fixed relative path: /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
Dec 23 10:50:05.893 [26285] dbg: config: using "/var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf" for included file
Dec 23 10:50:05.893 [26285] dbg: config: read file /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
Dec 23 10:50:07.189 [26285] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x97a7300) implements 'learner_new', priority 0
Dec 23 10:50:07.189 [26285] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x97a7300), bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
Dec 23 10:50:07.204 [26285] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x9957380)
Dec 23 10:50:07.212 [26285] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x97a7300) implements 'learner_is_scan_available', priority 0
Dec 23 10:50:07.212 [26285] dbg: bayes: tie-ing to DB file R/O /root/.spamassassin/bayes_toks
Dec 23 10:50:07.212 [26285] dbg: bayes: tie-ing to DB file R/O /root/.spamassassin/bayes_seen
Dec 23 10:50:07.213 [26285] dbg: bayes: found bayes db version 3
Dec 23 10:50:07.213 [26285] dbg: bayes: DB journal sync: last sync: 0
Dec 23 10:50:07.244 [26285] dbg: bayes: DB journal sync: last sync: 0
Dec 23 10:50:07.244 [26285] dbg: bayes: corpus size: nspam = 679, nham = 1212
Dec 23 10:50:07.269 [26285] dbg: bayes: score = 0.346512389874618
Dec 23 10:50:07.269 [26285] dbg: bayes: DB expiry: tokens in DB: 113590, Expiry max size: 150000, Oldest atime: 1405796976, Newest atime: 1419005450, Last expire: 0, Current time: 1419349807
Dec 23 10:50:07.269 [26285] dbg: bayes: DB journal sync: last sync: 0
Dec 23 10:50:07.269 [26285] dbg: bayes: untie-ing
Dec 23 10:50:07.661 [26285] dbg: rules: ran eval rule BAYES_40 ======> got hit (1)
Dec 23 10:50:07.764 [26285] dbg: check: tests=BAYES_40,MISSING_DATE,MISSING_HEADERS,NO_RECEIVED,NO_RELAYS
Dec 23 10:50:07.764 [26285] dbg: timing: total 2019 ms - init: 1469 (72.8%), parse: 0.49 (0.0%), extract_message_metadata: 0.86 (0.0%), get_uri_detail_list: 0.78 (0.0%), tests_pri_-1000: 7 (0.3%), compile_gen: 107 (5.3%), compile_eval: 26 (1.3%), tests_pri_-950: 6 (0.3%), tests_pri_-900: 11 (0.6%), tests_pri_-400: 29 (1.4%), check_bayes: 26 (1.3%), tests_pri_0: 412 (20.4%), tests_pri_500: 78 (3.8%), tests_pri_1000: 3 (0.2%)


I don't see anything particularly suspicious. Here is the same lint command on the host where bayes is working:
Dec 23 11:06:39.168 [7123] dbg: plugin: loading Mail::SpamAssassin::Plugin::Bayes from @INC
Dec 23 11:06:39.276 [7123] dbg: config: fixed relative path: /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
Dec 23 11:06:39.276 [7123] dbg: config: using "/var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf" for included file
Dec 23 11:06:39.276 [7123] dbg: config: read file /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
Dec 23 11:06:40.001 [7123] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x97f5ef8) implements 'learner_new', priority 0
Dec 23 11:06:40.001 [7123] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x97f5ef8), bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
Dec 23 11:06:40.009 [7123] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x9a471f0)
Dec 23 11:06:40.009 [7123] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x97f5ef8) implements 'learner_is_scan_available', priority 0
Dec 23 11:06:40.009 [7123] dbg: bayes: tie-ing to DB file R/O /root/.spamassassin/bayes_toks
Dec 23 11:06:40.010 [7123] dbg: bayes: tie-ing to DB file R/O /root/.spamassassin/bayes_seen
Dec 23 11:06:40.010 [7123] dbg: bayes: found bayes db version 3
Dec 23 11:06:40.010 [7123] dbg: bayes: opportunistic call attempt skipped, found fresh running expire magic token
Dec 23 11:06:40.025 [7123] dbg: bayes: opportunistic call attempt skipped, found fresh running expire magic token
Dec 23 11:06:40.025 [7123] dbg: bayes: corpus size: nspam = 3783, nham = 5179
Dec 23 11:06:40.045 [7123] dbg: bayes: score = 0.484370339736726
Dec 23 11:06:40.045 [7123] dbg: bayes: opportunistic call attempt skipped, found fresh running expire magic token
Dec 23 11:06:40.045 [7123] dbg: bayes: untie-ing
Dec 23 11:06:40.220 [7123] dbg: rules: ran eval rule BAYES_50 ======> got hit (1)
Dec 23 11:06:40.274 [7123] dbg: check: tests=BAYES_50,MISSING_DATE,MISSING_HEADERS,NO_RECEIVED,NO_RELAYS
Dec 23 11:06:40.275 [7123] dbg: timing: total 1132 ms - init: 868 (76.7%), parse: 0.50 (0.0%), extract_message_metadata: 0.66 (0.1%), get_uri_detail_list: 0.68 (0.1%), tests_pri_-1000: 4 (0.4%), compile_gen: 75 (6.6%), compile_eval: 13 (1.1%), tests_pri_-950: 3 (0.2%), tests_pri_-900: 3 (0.3%), tests_pri_-400: 24 (2.1%), check_bayes: 21 (1.8%), tests_pri_0: 180 (15.9%), tests_pri_500: 47 (4.2%)
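
The two lint runs hit different rules (BAYES_40 versus BAYES_50) because their scores (0.3465 and 0.4844) fall into different probability ranges, not because Bayes behaves differently on the two hosts. The mapping can be sketched in Python as below. Only the 40 to 60% range for BAYES_50 comes from the report quoted in the question; the other ranges and the boundary handling are assumptions about the stock 3.3.x rule set, and `bayes_rule` is a hypothetical helper name.

```python
# (lower, upper, rule) ranges as fractions of 1.0.
# Assumption: ranges other than BAYES_50 follow the stock 3.3.x rules, and a
# score matches when lower < score <= upper (a lower bound of 0 is inclusive).
BAYES_RANGES = [
    (0.00, 0.01, "BAYES_00"),
    (0.01, 0.05, "BAYES_05"),
    (0.05, 0.20, "BAYES_20"),
    (0.20, 0.40, "BAYES_40"),
    (0.40, 0.60, "BAYES_50"),
    (0.60, 0.80, "BAYES_60"),
    (0.80, 0.95, "BAYES_80"),
    (0.95, 0.99, "BAYES_95"),
    (0.99, 1.00, "BAYES_99"),
]


def bayes_rule(score: float) -> str:
    """Map a Bayes probability (0.0 to 1.0) to the rule name it would hit."""
    for lower, upper, rule in BAYES_RANGES:
        if (lower == 0.0 or score > lower) and score <= upper:
            return rule
    raise ValueError(f"score out of range: {score}")


print(bayes_rule(0.346512389874618))  # score from the non-working host's lint run
print(bayes_rule(0.484370339736726))  # score from the working host's lint run
```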


This is a diff between the lint runs on the non-working host (<) and the working host (>):
12,17c12,16
<  dbg: bayes: DB journal sync: last sync: 0
<  dbg: bayes: DB journal sync: last sync: 0
<  dbg: bayes: corpus size: nspam = 679, nham = 1212
<  dbg: bayes: score = 0.346512389874618
<  dbg: bayes: DB expiry: tokens in DB: 113590, Expiry max size: 150000, Oldest atime: 1405796976, Newest atime: 1419005450, Last expire: 0, Current time: 1419349807
<  dbg: bayes: DB journal sync: last sync: 0
---
>  dbg: bayes: opportunistic call attempt skipped, found fresh running expire magic token
>  dbg: bayes: opportunistic call attempt skipped, found fresh running expire magic token
>  dbg: bayes: corpus size: nspam = 3783, nham = 5179
>  dbg: bayes: score = 0.484370339736726
>  dbg: bayes: opportunistic call attempt skipped, found fresh running expire magic token
19,21c18,20
<  dbg: rules: ran eval rule BAYES_40 ======> got hit (1)
<  dbg: check: tests=BAYES_40,MISSING_DATE,MISSING_HEADERS,NO_RECEIVED,NO_RELAYS
<  dbg: timing: total 2019 ms - init: 1469 (72.8%), parse: 0.49 (0.0%), extract_message_metadata: 0.86 (0.0%), get_uri_detail_list: 0.78 (0.0%), tests_pri_-1000: 7 (0.3%), compile_gen: 107 (5.3%), compile_eval: 26 (1.3%), tests_pri_-950: 6 (0.3%), tests_pri_-900: 11 (0.6%), tests_pri_-400: 29 (1.4%), check_bayes: 26 (1.3%), tests_pri_0: 412 (20.4%), tests_pri_500: 78 (3.8%), tests_pri_1000: 3 (0.2%)
---
>  dbg: rules: ran eval rule BAYES_50 ======> got hit (1)
>  dbg: check: tests=BAYES_50,MISSING_DATE,MISSING_HEADERS,NO_RECEIVED,NO_RELAYS
>  dbg: timing: total 1132 ms - init: 868 (76.7%), parse: 0.50 (0.0%), extract_message_metadata: 0.66 (0.1%), get_uri_detail_list: 0.68 (0.1%), tests_pri_-1000: 4 (0.4%), compile_gen: 75 (6.6%), compile_eval: 13 (1.1%), tests_pri_-950: 3 (0.2%), tests_pri_-900: 3 (0.3%), tests_pri_-400: 24 (2.1%), check_bayes: 21 (1.8%), tests_pri_0: 180 (15.9%), tests_pri_500: 47 (4.2%)


The main differences I see are that the working host shows no "DB journal sync: last sync: 0" or "Last expire: 0" messages, and the non-working host shows no "opportunistic call attempt skipped" messages.

Do these messages tell you anything?
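
For anyone repeating this comparison: the raw debug lines differ on every run in their timestamp, process ID, and Perl object addresses, so those need to be stripped before diffing. A minimal Python sketch of that normalization follows, assuming the line prefix looks like the logs above; `normalize` and `diff_logs` are illustrative names.

```python
import difflib
import re
from typing import List

# Leading "Dec 23 10:50:05.770 [26285] " prefix: timestamp plus process ID.
PREFIX = re.compile(r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\.\d+\s+\[\d+\]\s+")
# Perl object addresses such as HASH(0x97a7300) change on every run.
HASH_ADDR = re.compile(r"HASH\(0x[0-9a-f]+\)")


def normalize(log_text: str) -> List[str]:
    """Strip run-specific noise so only meaningful differences remain."""
    lines = []
    for line in log_text.splitlines():
        line = PREFIX.sub("", line)
        line = HASH_ADDR.sub("HASH(0x...)", line)
        lines.append(line)
    return lines


def diff_logs(not_working: str, working: str) -> List[str]:
    """Return a unified diff of two normalized debug logs."""
    return list(
        difflib.unified_diff(
            normalize(not_working),
            normalize(working),
            fromfile="not-working",
            tofile="working",
            lineterm="",
        )
    )
```

Calling `diff_logs` on the two saved lint outputs leaves only lines such as the corpus size, score, and expiry messages in the result.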
btan (Exec Consultant) commented:
I noticed "autolearn=no" in the X-Spam-Status header.
If a message has already been learned by SpamAssassin, then that message will not be learned again. Therefore, if you run a message through SpamAssassin to see why it was classified as spam or ham, and it has already been learned, you will always get the result "autolearn=no". (To see this more clearly, use the "-D" flag, and you will see debug output explaining that the message has already been learned.)
https://wiki.apache.org/spamassassin/AutolearningNotWorking

I am wondering whether it needs to relearn.
If you find that SA never seems to learn messages, try using sa-learn --dump magic to find out more about your database. The line "nham" is the number of ham messages SA has learned, and the line "nspam" is the number of spam messages SA has learned.
http://wiki.apache.org/spamassassin/BayesNotWorking
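
To check those two counts programmatically, something like the following Python sketch could be used. It assumes the usual column layout of `sa-learn --dump magic` (value in the third column, label after "non-token data:") and the default minimum of 200 spam and 200 ham before Bayes scores are applied. The sample text is illustrative, built from the counts in the lint output earlier in this thread, not captured from a real dump; `parse_dump_magic` and `bayes_can_score` are hypothetical helper names.

```python
from typing import Dict

MIN_SPAM = 200  # assumed default of bayes_min_spam_num
MIN_HAM = 200   # assumed default of bayes_min_ham_num

# Illustrative sample in the assumed `sa-learn --dump magic` layout.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        679          0  non-token data: nspam
0.000          0       1212          0  non-token data: nham
0.000          0     113590          0  non-token data: ntokens
"""


def parse_dump_magic(output: str) -> Dict[str, int]:
    """Collect the 'non-token data' entries into a name -> value mapping."""
    marker = "non-token data: "
    values: Dict[str, int] = {}
    for line in output.splitlines():
        fields = line.split(None, 4)
        if len(fields) == 5 and fields[4].startswith(marker):
            values[fields[4][len(marker):].strip()] = int(fields[2])
    return values


def bayes_can_score(values: Dict[str, int]) -> bool:
    """True when both corpora have reached the assumed minimum sizes."""
    return values.get("nspam", 0) >= MIN_SPAM and values.get("nham", 0) >= MIN_HAM


magic = parse_dump_magic(SAMPLE)
print(magic["nspam"], magic["nham"], bayes_can_score(magic))
```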

Both sets of messages in your diff relate to Bayes journal syncing and token expiry rather than to scoring.
SpamAssassin can sync the journal and expire the DB tokens either manually or opportunistically. A journal sync is due if --sync is passed to sa-learn (manual), or if the conditions listed in the sa-learn documentation hold (opportunistic).
See the Expiration section; the "Getting started" and "Effective Learning" sections are also worth reading: http://spamassassin.apache.org/full/3.3.x/doc/sa-learn.html
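
The "DB expiry" debug line reports its times as Unix epoch seconds, which are easier to judge as dates. A small Python sketch, using the three values from the non-working host's lint output; `describe_expiry` is an illustrative helper name and the dates are computed in UTC.

```python
from datetime import datetime, timezone
from typing import Dict, Union

SECONDS_PER_DAY = 86400


def describe_expiry(
    oldest_atime: int, newest_atime: int, current_time: int
) -> Dict[str, Union[str, float]]:
    """Convert the epoch values from a 'DB expiry' debug line to UTC dates."""

    def utc_date(ts: int) -> str:
        return datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()

    return {
        "oldest": utc_date(oldest_atime),
        "newest": utc_date(newest_atime),
        "current": utc_date(current_time),
        # How long ago the most recently used token was touched.
        "days_since_newest": (current_time - newest_atime) / SECONDS_PER_DAY,
        # Time span covered by the tokens in the database.
        "span_days": (newest_atime - oldest_atime) / SECONDS_PER_DAY,
    }


# Values taken from the "DB expiry" line of the non-working host.
info = describe_expiry(1405796976, 1419005450, 1419349807)
for key, value in info.items():
    print(key, value)
```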
robocat commented:
Can you run SpamAssassin manually in debug mode?

spamassassin -D bayes <sometestspam.txt
Mark (author) commented:
Sorry for the delay (flu). Will test this afternoon.
Mark (author) commented:
Well, it just started working all of a sudden! Perhaps I didn't have enough spam/ham in the database, but I was sure I did. I guess this one is solved.