How to trap UTF-8 encoded Subject in SpamAssassin

Posted on 2014-07-11
Medium Priority
Last Modified: 2014-08-25
I'm trying to catch messages with UTF-8 encoded headers in SpamAssassin. I have the following in /etc/mail/spamassassin/local.cf:
header LOCAL_UTF_SUBJECT        Subject:raw =~ /^=?utf-8/i
describe LOCAL_UTF_SUBJECT Subject line is utf encoded


I'm sending a test message with the subject "=?utf-8?Q?hello?=" using `telnet <host> 25` to ensure the subject does not get escaped (it does not). Yet my rule does not match this subject content. My rule must be wrong. Any ideas why?
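
One thing worth checking in the rule above is the unescaped `?`: in a regular expression, `=?` means "an optional `=`", not a literal question mark. The sketch below uses Python's `re` module as a stand-in for the Perl regexes SpamAssassin uses (an assumption; for these simple patterns the two should behave the same) and compares the original pattern with a variant in which the question mark is escaped.

```python
import re

# Test subject from the question, exactly as it appears on the wire.
raw_subject = "=?utf-8?Q?hello?="

# Pattern from the original rule: "=?" is "an optional '='", so the
# regex expects "utf-8" (or "=utf-8") at the very start of the value.
original = re.compile(r"^=?utf-8", re.IGNORECASE)

# Variant with the question mark escaped so it matches a literal "?".
escaped = re.compile(r"^=\?utf-8", re.IGNORECASE)

original_hit = original.search(raw_subject) is not None
escaped_hit = escaped.search(raw_subject) is not None

print("original pattern matches:", original_hit)
print("escaped pattern matches: ", escaped_hit)
```

If the same holds in SpamAssassin, writing the rule body as `/^=\?utf-8/i` would be the minimal change for a subject that starts with an encoded word.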
Question by:jmarkfoley

Expert Comment

ID: 40242581
You should remove the ^.

It is a bad idea to keep it, because it is perfectly legitimate to encode only part of the subject as UTF-8.

Additionally, I'm not 100% sure (I haven't written an SA rule in years), but I recollect that header rules are passed the whole header and not just the value, so /subject:\s*=?utf-8/i would work for your test email. It is simpler to just remove the ^, though, if you don't need to match only subjects that start with utf-8.

BTW, I'm not sure that rule is such a good idea (and 2 is quite a high score).
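
To illustrate the point about partially encoded subjects, here is a minimal sketch using Python's `re` module as a stand-in for SpamAssassin's Perl regexes. Both subject strings are hypothetical examples, and the question mark is escaped so that it matches literally.

```python
import re

# Hypothetical subjects: one fully encoded, one with a plain-text prefix
# followed by an encoded word (a legitimate form of encoded header).
fully_encoded = "=?utf-8?Q?hello?="
partly_encoded = "Re: =?utf-8?Q?caf=C3=A9?= menu"

anchored = re.compile(r"^=\?utf-8", re.IGNORECASE)    # must start the value
unanchored = re.compile(r"=\?utf-8", re.IGNORECASE)   # may appear anywhere

results = {
    "anchored/full": anchored.search(fully_encoded) is not None,
    "anchored/partial": anchored.search(partly_encoded) is not None,
    "unanchored/full": unanchored.search(fully_encoded) is not None,
    "unanchored/partial": unanchored.search(partly_encoded) is not None,
}

for name, hit in results.items():
    print(f"{name}: {hit}")
```

Only the anchored pattern misses the subject that begins with plain text, which is the case dropping the ^ is meant to cover.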

Author Comment

ID: 40256751
skullnobrains: > it is a bad idea to have it because it is perfectly legit to encode only part of the subject as utf8

This may be true, but we have *only* received utf8 encoded subjects from people trying to sell things. Besides, it just marks it as spam so the users can look in their spam folders for false positives.

I believe I have filtered these out with:

Subject:raw =~ /(utf-8|Cp1252|iso-8859|Windows-1252)/i

Without the "raw" modifier, the subject is decoded first. I've tried the same thing for From:

From:raw =~ /UTF-8/i

Any ideas?
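
The difference the "raw" modifier makes can be sketched outside SpamAssassin. The example below uses Python's `email.header` module to stand in for the decoding step (an assumption; SpamAssassin's own decoder is not used here) and shows that the charset name is only visible in the undecoded header value.

```python
import re
from email.header import decode_header, make_header

# Header value as sent on the wire, and the same value after decoding.
raw_subject = "=?utf-8?Q?hello?="
decoded_subject = str(make_header(decode_header(raw_subject)))

# Charset alternation from the rule above, applied to both forms.
charset_pattern = re.compile(
    r"(utf-8|Cp1252|iso-8859|Windows-1252)", re.IGNORECASE
)

raw_hit = charset_pattern.search(raw_subject) is not None
decoded_hit = charset_pattern.search(decoded_subject) is not None

print("decoded subject:", decoded_subject)
print("match on raw value:    ", raw_hit)
print("match on decoded value:", decoded_hit)
```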

Accepted Solution

skullnobrains earned 2000 total points
ID: 40258984
As far as the rule is concerned, if you're trying to filter only UTF-8 subjects, no problem; you know what you are doing.

Your new rule does not have the ^ character, so it is not anchored, which is why it works.
i guess
would be a little better but it is quite unlikely that a valid mail subject would contain those strings

Have you tried the syntax I gave?
"/^subject:\s*=?utf-8/i" (with the forgotten caret)

I'm not sure that filtering all those charsets is very meaningful, but you know what you are doing. For the record, Outlook users with locales that use accented characters are very likely to get filtered. Depending on your usual traffic, this may or may not be a good idea.
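
As a rough illustration of that false-positive risk, the sketch below uses Python's `email.header.Header` to stand in for a mail client that encodes an accented subject (an assumption; real clients such as Outlook may pick a different charset or encoding). The two subject lines are hypothetical.

```python
import re
from email.header import Header

charset_pattern = re.compile(
    r"(utf-8|Cp1252|iso-8859|Windows-1252)", re.IGNORECASE
)

# A legitimate subject with an accented character, encoded as ISO-8859-1,
# and a plain ASCII subject that needs no encoding at all.
accented = Header("R\u00e9union demain", "iso-8859-1").encode()
plain = Header("Meeting tomorrow").encode()

accented_hit = charset_pattern.search(accented) is not None
plain_hit = charset_pattern.search(plain) is not None

print(accented, "->", accented_hit)
print(plain, "->", plain_hit)
```

Any ordinary message whose subject needs an encoded word would hit the rule in the same way, so the score assigned to it matters.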

Author Comment

ID: 40283639
This appears to be working with

Subject:raw =~ /(utf-8|Cp1252|iso-8859|Windows-1252)/i

