Solved

Spamassassin sa-learn

Posted on 2004-09-18
Last Modified: 2010-08-05
Hello All,

My problem is that sometimes my own email is marked as spam, and sometimes spam arrives in my mailbox.

I have heard about sa-learn, but I am not sure how it works.

I read man sa-learn, but I still don't understand it.

It says something about ham and spam.

I am using an Ensim box, and I have one email account named backup; all spam is automatically delivered to that account.

So what exactly do I have to do to protect myself from spam? I want SpamAssassin to treat only real spam as spam, not email from my clients or friends.

Please let me know the procedure for blocking spam and the procedure for delivering non-spam email to my inbox.

Please guide me in detail, as I am not an expert and I don't know much about SpamAssassin.

Should I run sa-learn like this?

sa-learn --showdots --mbox --spam backup
sa-learn --showdots --mbox --ham backup

If I do the above, will sa-learn automatically treat real spam as spam? And if the commands are correct, how do I check that they worked and what their output was?

Thank you
Question by:wcws
13 Comments
 
LVL 17

Expert Comment

by:owensleftfoot
ID: 12095618
To use sa-learn you need two mboxes (or files): one for spam that SpamAssassin missed and one for good emails that SpamAssassin marked as spam. Forward each message to the relevant account, i.e. forward spam that SpamAssassin missed to an mbox or file called spam, and forward legitimate emails that SpamAssassin mistakenly marked as spam to an mbox or file called ham. Then run sa-learn --showdots --mbox --spam spam and
sa-learn --showdots --mbox --ham ham.
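
As a rough sketch, the two commands above could be wrapped in a small script so that both mailboxes are processed in one go. The paths /var/mail/spam and /var/mail/ham and the SA_LEARN override are assumptions for illustration only; point them at wherever your spam and ham mboxes actually live.

```shell
#!/bin/sh
# Sketch: train the Bayes database from two mbox files.
# SA_LEARN, SPAM_MBOX and HAM_MBOX are hypothetical defaults; adjust them.
SA_LEARN="${SA_LEARN:-sa-learn}"
SPAM_MBOX="${SPAM_MBOX:-/var/mail/spam}"
HAM_MBOX="${HAM_MBOX:-/var/mail/ham}"

# train KIND FILE: KIND is "spam" or "ham", FILE is an mbox file.
train() {
    kind="$1"
    mbox="$2"
    # Nothing to learn from if the mailbox is missing or empty.
    if [ ! -s "$mbox" ]; then
        echo "skipping $kind: $mbox is missing or empty"
        return 0
    fi
    "$SA_LEARN" --showdots --mbox "--$kind" "$mbox"
}

train spam "$SPAM_MBOX"
train ham "$HAM_MBOX"
```

sa-learn prints a short summary of how many messages it learned from when it finishes, which is one way to confirm that a run did something.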
 

Author Comment

by:wcws
ID: 12097016
How can I set up MailScanner to use an auto-learn option, so that I don't have to do anything manually?

Please advise.
 
LVL 17

Expert Comment

by:owensleftfoot
ID: 12097328
You can't. SpamAssassin relies on you to tell it what is spam and what is not. You could set up a cron job to have sa-learn parse your spam and ham mailboxes automatically, but it will still be up to you to forward unidentified spam to the spam mailbox and good emails mistakenly identified as spam to the ham mailbox.

 

Author Comment

by:wcws
ID: 12097369
OK, so you are saying that I should create two mailboxes:

one: spam (which is really 100% spam emails)
second: ham (which is spam missed by SpamAssassin or good emails marked as spam)

I am running a small server with 40 customers, meaning 40 domain names. How am I going to know which of their emails are spam or ham?

Please advise
 
LVL 2

Expert Comment

by:dfk
ID: 12101891
Ah, ensim + mailscanner + spamassassin.

I suggest you read this document.  http://forums.ev1servers.net/showthread.php?s=942451ebe8112fa68a36020ce97e611b&threadid=20385

Best wishes
Mark Waterhouse
DFK Systems Limited
http://www.dfk-systems.com/
 

Author Comment

by:wcws
ID: 12102512
dfk: That link is a how-to for installing MailScanner; nothing there explains sa-learn.
 
LVL 2

Accepted Solution

by:
dfk earned 500 total points
ID: 12102650
wcws

You're right ;-)
I forgot what the initial question was as I read down the thread.


To answer your question regarding your virtual domains, there isn't an easy way. Some people suggest having users forward the 'spam' emails to a certain email address so that the system can learn from them (via cron, for example). However, the problem here is the MUA in question: many MUAs (Outlook etc.) munge the headers and do not forward them correctly.

If you allow SpamAssassin to use Bayesian databases (in local.cf, use_bayes 1), you should start to notice a gradual decline in spam. Our customers have quarantine directories set up, and all 'suspected spam' messages go there. It is then up to the user whether they want that message delivered or not.

If you run 'sa-learn --dump magic', you will get an idea of how well trained your spam/ham Bayesian classifier is. Note: if the number of learned spam or ham messages is small (below 200 of each), SpamAssassin will not, by default, use the Bayes results when scoring your mail, as the databases are still learning.
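
To make the 200-message check concrete, here is a minimal sketch that pulls the learned spam and ham counts out of the 'sa-learn --dump magic' output. It assumes the usual layout, where each counter sits in the third column of a line ending in "non-token data: nspam" or "non-token data: nham"; the function names bayes_count and bayes_ready are made up for this example.

```shell
#!/bin/sh
# bayes_count NAME: read `sa-learn --dump magic` output on stdin and print
# the counter for NAME ("nspam" or "nham").
# Assumption: the value is the third column of the matching line.
bayes_count() {
    awk -v name="$1" '$NF == name && /non-token data:/ { print $3; exit }'
}

# bayes_ready NSPAM NHAM: succeed once both counts reach 200,
# the threshold mentioned above.
bayes_ready() {
    [ "$1" -ge 200 ] && [ "$2" -ge 200 ]
}

# Typical use on a box with SpamAssassin installed:
#   dump=$(sa-learn --dump magic)
#   nspam=$(printf '%s\n' "$dump" | bayes_count nspam)
#   nham=$(printf '%s\n' "$dump" | bayes_count nham)
#   bayes_ready "$nspam" "$nham" || echo "still learning: $nspam spam, $nham ham"
```

Run it as the same user whose Bayes database the mail scanner uses, otherwise the counts may come from a different, empty database.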

You may also want to consider looking into other spam filters (dspam is very good - http://www.nuclearelephant.com/projects/dspam/).

Hope that better answers your question - apologies for not answering it first time!

Best wishes
Mark Waterhouse
DFK Systems Limited
http://www.dfk-systems.com/
 

Author Comment

by:wcws
ID: 12102819
I am running an Ensim box, so do you mean I should modify

/etc/mail/spamassassin/local.cf? Will that apply globally to every domain?

Could you paste your local.cf here so that I can get an idea?
 
LVL 2

Expert Comment

by:dfk
ID: 12103044
The local.cf file I have (on my ensim box) is indeed in /etc/mail/spamassassin/local.cf and contains my system whitelists (whitelist_from @dfk-systems.com).  However, depending on how you compiled spamassassin, the local.cf could be anywhere :-(
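
If it helps, here is a small sketch for checking which Bayes and whitelist settings are active in whichever local.cf you find. The helper name show_sa_settings is invented for this example, and the path in the comment is simply the one from my box, so treat both as assumptions.

```shell
#!/bin/sh
# show_sa_settings FILE: print the Bayes and whitelist lines that are
# active (not commented out) in a SpamAssassin config file.
show_sa_settings() {
    grep -E '^[[:space:]]*(use_bayes|bayes_auto_learn|whitelist_from)[[:space:]]' "$1"
}

# Example (path as on my box; yours may differ):
#   show_sa_settings /etc/mail/spamassassin/local.cf
#   spamassassin --lint    # reports syntax errors in the loaded config
```

Running spamassassin --lint after any change is a cheap way to catch a typo in local.cf before it affects mail.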

If you are still running MailScanner, all of this configuration is done in the MailScanner configuration files (perhaps in /opt/MailScanner/etc). You can use the MailScanner rules to set up per-user or per-domain settings if required.

If your MailScanner.conf file has the following options set to yes (or to a custom rule):
Virus Scanning = yes (or rule)
Spam Checks = yes (or rule)

then Bayesian databases will be in operation, as I don't believe MailScanner has an option to turn them off.

--Mark

 
LVL 4

Expert Comment

by:itcnbwise
ID: 12103434
FYI - I use spamassassin, and it's nice, but if you want to automatically remove incoming spam before users even get it, try using Vipul's Razor along with MailScanner - it works awesome: http://razor.sourceforge.net/
 
LVL 17

Expert Comment

by:owensleftfoot
ID: 12106840
"one: spam (which is really 100% spam emails)
second: ham (which is spam missed by SpamAssassin or good emails marked as spam)

I am running a small server with 40 customers, meaning 40 domain names. How am I going to know which of their emails are spam or ham?"

"second" is wrong: ham is only real emails incorrectly marked as spam by SpamAssassin.
As for your customers, I assume the server you are hosting them on has its own valid domain name? Just get your customers to forward spam missed by SpamAssassin to spam@yourdomainname and good emails marked as spam by SpamAssassin to ham@yourdomainname. Then run sa-learn with the --spam option on the spam mailbox and sa-learn --ham on the ham mailbox. (You can set up a cron job to get sa-learn to run automatically.)
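
For the cron part, a minimal sketch: the snippet below writes a crontab fragment to a file so you can review it before installing. The schedule, the file name sa-learn.cron and the mailbox paths are all assumptions; substitute wherever the spam@ and ham@ mailboxes are actually stored on your server.

```shell
#!/bin/sh
# Sketch: write a crontab fragment for review.
# Schedule, file name and mailbox paths are hypothetical.
cat > sa-learn.cron <<'EOF'
# m h dom mon dow  command
15 3 * * * sa-learn --mbox --spam /var/mail/spam
25 3 * * * sa-learn --mbox --ham /var/mail/ham
EOF
```

Paste the two entries into the crontab of the user whose Bayes database the scanner uses (crontab -e). Be careful with 'crontab sa-learn.cron', since that replaces the user's whole crontab rather than adding to it. Cron normally mails any output of the job to its owner, so the sa-learn summary should show up there.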
 
 

Author Comment

by:wcws
ID: 12107041
How can I set up a cron job to run this automatically?

Please advise
 
LVL 2

Expert Comment

by:dfk
ID: 12107208
owensleftfoot - the problem with having users forward emails is the reliance on them actually forwarding useful information. If they neglect to send the email with the full headers, all you succeed in doing is adding some of the content and incorrect headers... useless against RBLs.

-- Mark