Solved

SpamAssassin and Bayes

Posted on 2014-08-24
Last Modified: 2014-09-26
I am in the process of learning about SpamAssassin and Bayes. I have been doing a lot of reading on the internet, but I am not sure I am grasping something here. If someone could help me understand, that would be great.

First, I know there are many reasons why a message could or could not get trapped by a spam filter. However, I am trying to figure this one part out. If a clearly spam message has Bayes_00 in the header (which means Bayes thinks the message is ham), does this mean that at some point SpamAssassin/Bayes learned this message as ham, either via some automatic means or by someone running sa-learn --ham on the spam message (for some unknown reason)? If so, is there any way to know for sure it was learned as ham by Bayes? Or is that what Bayes_00 means?

Is it possible to have Bayes_00 in the header even though the message was never learned as either ham or spam, so that it just has Bayes_00 for other reasons? I am not sure if I am understanding this right or even verbalizing it right. I appreciate any help in understanding.
Question by:camstutz
11 Comments
 
LVL 84

Accepted Solution

by:
David Johnson, CD, MVP earned 2000 total points
ID: 40282495
It can mean either that the user has told it that it is ham OR that the message doesn't fulfill the rules to classify it as spam.
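
To make the second case concrete: Bayes scores a message from the tokens it contains, so a message can land in BAYES_00 without that exact message ever having been learned. A minimal Python sketch of how a Bayes spam probability maps to a BAYES_* rule name is below. The bucket ranges are my understanding of the stock rules and the boundary handling is an assumption, so check the Bayes rule file in your own install before relying on them; `bayes_rule_for` is a hypothetical helper, not part of SpamAssassin.

```python
# Probability buckets for the stock BAYES_* rules, as I understand them.
# Assumption: verify the exact ranges against the rules shipped with your install.
BAYES_BUCKETS = [
    (0.00, 0.01, "BAYES_00"),
    (0.01, 0.05, "BAYES_05"),
    (0.05, 0.20, "BAYES_20"),
    (0.20, 0.40, "BAYES_40"),
    (0.40, 0.60, "BAYES_50"),
    (0.60, 0.80, "BAYES_60"),
    (0.80, 0.95, "BAYES_80"),
    (0.95, 0.99, "BAYES_95"),
    (0.99, 1.00, "BAYES_99"),
]


def bayes_rule_for(probability):
    """Return the BAYES_* rule name for a spam probability in [0, 1]."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for low, high, name in BAYES_BUCKETS:
        # Half-open ranges [low, high) are a simplification made for this sketch.
        if low <= probability < high:
            return name
    return "BAYES_99"  # probability == 1.0 falls through to the top bucket


print(bayes_rule_for(0.005))
print(bayes_rule_for(0.5))
```

So BAYES_00 only says that the tokens in the message looked very ham-like to the database at scan time; it does not by itself say that this message was ever trained.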
 

Author Comment

by:camstutz
ID: 40282525
Hello David,

Thank you for answering. I'm glad I am not completely crazy; I did think that might be one possibility. If I may, I have a few more questions.

Is there any way to know for sure which was the case? Does it really matter? It seems as if the answer is the same either way: learn it as spam.

Note: we do have other scoring in place (AWL, Razor, etc.). I've read that if Bayes becomes heavily weighted toward spam or ham (via its learned tokens, seen via sa-learn --dump magic), it could produce this outcome as well. Would a difference of 3000 (favoring ham) mean that the database is off?

Is it possible to create a query with sa-learn or spamassassin to view if and when the message was learned as ham, or whether it just did not score enough points? If you use sa-learn --ham or --spam, it would just relearn the message as one or the other.
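
For the ham/spam balance question, here is a rough Python sketch that reads the nspam and nham counters out of `sa-learn --dump magic` output. The sample text is illustrative only: the numbers are made up to match the difference of 3000 mentioned above, and `parse_dump_magic` is a hypothetical helper. It assumes the usual four numeric columns followed by a "non-token data:" label.

```python
import re

# Illustrative output only; the numbers are invented for this example.
SAMPLE_DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0       1200          0  non-token data: nspam
0.000          0       4200          0  non-token data: nham
0.000          0     150000          0  non-token data: ntokens
"""

# Four whitespace-separated columns, then the label; the third column is the value.
LINE_RE = re.compile(r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (.+)$")


def parse_dump_magic(text):
    """Map each 'non-token data' label to its integer value."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[match.group(2).strip()] = int(match.group(1))
    return counts


counts = parse_dump_magic(SAMPLE_DUMP)
print("nspam:", counts["nspam"], "nham:", counts["nham"])
print("ham minus spam:", counts["nham"] - counts["nspam"])
```

In practice you would pipe the real command output into the function instead of using the sample string. These counters only describe the database as a whole; they do not show whether one particular message was learned.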
 
LVL 13

Expert Comment

by:Sandy
ID: 40282533
I would suggest you try MailScanner, which can show you on what basis a message has been classified as ham or spam. It has a web panel where you can clearly see the score of each mail, which can help with further classification of the message.

Ty/SA

 

Author Comment

by:camstutz
ID: 40282851
Thanks for the information, Sandy. Unfortunately, it is not an option at this point. However, I do have the ability to read through the message headers (which include SpamAssassin scoring for our clients' emails). I neglected to mention that we run email servers for our clients, and at the moment I don't think implementing MailScanner is possible.

I appreciate the suggestion though.
 
LVL 13

Expert Comment

by:Sandy
ID: 40282853
OK, but if you read the message headers, they also carry a spam score indication if Razor/Pyzor/DCC are in place and configured.

Ty/SA
 

Author Comment

by:camstutz
ID: 40282872
I do see that, though not every message has a Razor indication. However, using our current implementation of SpamAssassin (MailScanner looks like a replacement), I was just wondering if there was a way to answer a few more of the questions I have :)
Pretty much, I get that if it has Bayes_00, the best course of action is to learn it as spam, regardless of the aforementioned reasoning. However, I guess I wanted to drill into mail filtering a bit more and figure out if Bayes truly learned this spam as ham once. Oh well, I guess I will spend a few moments with the spamc / spamassassin man pages. Maybe it isn't possible either.
 
LVL 13

Expert Comment

by:Sandy
ID: 40282905
AFAIK, with Bayes_00 the message is only treated as spam when the spam score comes to more than 20, which is the default behavior, so it is always suggested to configure multiple rules with your own definitions. The system is intelligent, but it still needs a little more fine-tuning to recognize exactly what is spam. With corporate mail it is a bit tough for the system to judge based on Bayes_00 alone.

I'm not sure, but check whether it has auto-generated mail IDs like ham@domain and spam@domain (as in Zimbra), which are used for manually feeding the spam filter so it learns which messages are spam and which are ham.

Hope I'm clear here. :)

TY/SA
 

Author Comment

by:camstutz
ID: 40284662
Thanks for the comment, Sandy.
 

Author Comment

by:camstutz
ID: 40307786
I'm reading that the default threshold for ham autolearning is 0.1; however, I notice that many spam messages are getting autolearned as ham with a score of 0.8. How do I check to see what is wrong?
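
One thing worth checking: as far as I understand the autolearn documentation, the score compared against the threshold is not the score shown in the header. Rules flagged learn (the Bayes rules), userconf, or noautolearn (AWL, for example) are left out. The Python sketch below is a simplified model of that idea, not SpamAssassin's actual code. It ignores the separate score set and the extra header/body point requirements for spam, and the rule scores, the AWL flags, and the `autolearn_decision` helper are assumptions made up for illustration.

```python
# Rule flags that exclude a hit from the autolearn score in this simplified model.
EXCLUDED_TFLAGS = {"learn", "userconf", "noautolearn"}


def autolearn_decision(hits, ham_threshold=0.1, spam_threshold=12.0):
    """hits: iterable of (rule_name, score, tflags). Returns (decision, score)."""
    score = sum(
        rule_score
        for _, rule_score, tflags in hits
        if not (set(tflags) & EXCLUDED_TFLAGS)
    )
    if score < ham_threshold:
        return "ham", score
    if score >= spam_threshold:
        return "spam", score
    return "no", score


# Invented example: the header score is about 0.8, almost all of it from AWL.
hits = [
    ("HTML_MESSAGE", 0.001, set()),
    ("AWL", 0.799, {"userconf", "noautolearn"}),
]
header_score = sum(rule_score for _, rule_score, _ in hits)
decision, learn_score = autolearn_decision(hits)
print("header score:", round(header_score, 3))
print("autolearn:", decision, "with score", learn_score)
```

Under these assumptions a message can show 0.8 in the header and still fall under the 0.1 ham threshold, so comparing the rules listed in tests= against their flags is a reasonable first check.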
 

Author Comment

by:camstutz
ID: 40334361
David: thanks for your answer. I do receive a lot of spam email that doesn't hit any scoring rules and says Bayes_00. Is there any way to deal with this? I appreciate your previous answer, and the only thing I can think of is to learn it as spam for them or clear the Bayes DB for them.

I understand that, in theory, Bayes might not have been available at the time and the net tests might also have been unavailable for some reason, but to not even match local rules? That seems possible, but unlikely. Either something is wrong or something is misconfigured.
 
LVL 84

Expert Comment

by:David Johnson, CD, MVP
ID: 40334379
The more you use it, the better it gets.


Question has a verified solution.
