How to trap UTF-8 encoded Subject in SpamAssassin

Posted on 2014-07-11
Last Modified: 2014-08-25
I'm trying to catch messages with utf-8 encoded headers in spamassassin. I have the following in /etc/mail/spamassassin/local.cf:
header LOCAL_UTF_SUBJECT        Subject:raw =~ /^=?utf-8/i
describe LOCAL_UTF_SUBJECT Subject line is utf encoded


I'm sending a test message with the subject "=?utf-8?Q?hello?=" using `telnet <host> 25` to ensure the subject does not get escaped (it does not). Yet my rule does not match this subject. My rule must be wrong. Any ideas why?
Question by:jmarkfoley
    LVL 25

    Expert Comment

    you should remove the ^

    it is a bad idea to have it because it is perfectly legit to encode only part of the subject as utf8

    additionally, i'm not 100% sure (haven't written an SA rule in years), but as i recall header rules are passed the whole header and not just the value, so /subject:\s*=?utf-8/i would work for your test email. still, it is simpler to just remove the ^ if you don't need to match only subjects that start with utf-8

    btw i'm unsure that rule is such a good idea (and 2 is quite some score)
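
A note on why dropping the ^ changes the result: in Perl-style regular expressions an unescaped ? is a quantifier, so =? means "an optional =" rather than the two literal characters that open an encoded word. The sketch below checks this with Python's re module, on the assumption that it treats these particular patterns the same way as the Perl regex engine SpamAssassin uses. The variable names are made up for illustration.

```python
import re

# Raw (undecoded) subject value from the test message in the question.
raw_subject = "=?utf-8?Q?hello?="

# Pattern as written in the question: "=?" is "zero or one =".
anchored_unescaped = re.compile(r"^=?utf-8", re.IGNORECASE)
# Same pattern with the ^ anchor removed.
unanchored_unescaped = re.compile(r"=?utf-8", re.IGNORECASE)
# Question mark escaped, so "=?" is matched literally at the start.
anchored_escaped = re.compile(r"^=\?utf-8", re.IGNORECASE)

# No match: the literal "?" after "=" blocks "utf-8" at position 0.
print(bool(anchored_unescaped.search(raw_subject)))
# Match, but only on the bare "utf-8" found later in the string.
print(bool(unanchored_unescaped.search(raw_subject)))
# Match on the full "=?utf-8" prefix.
print(bool(anchored_escaped.search(raw_subject)))
```

If that assumption holds, a rule meant to fire only on subjects that begin with an encoded word would presumably need the question mark escaped, for example /^=\?utf-8\?/i.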
    LVL 1

    Author Comment

    skullnobrains: > it is a bad idea to have it because it is perfectly legit to encode only part of the subject as utf8

    This may be true, but we have *only* received utf8 encoded subjects from people trying to sell things. Besides, it just marks it as spam so the users can look in their spam folders for false positives.

    I believe I have filtered these out with:

    Subject:raw =~ /(utf-8|Cp1252|iso-8859|Windows-1252)/i

    Without the "raw" modifier, the subject is decoded first. I've tried the same thing for From:

    From:raw =~ /UTF-8/i

    any ideas?
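
To illustrate the point about decoding: an encoded word of the form =?charset?encoding?text?= names its charset only in the raw header, so once the header has been decoded the string "utf-8" is gone and a pattern looking for it has nothing to match. A minimal sketch using Python's standard email.header module as a stand-in for the decoding SpamAssassin performs (an assumption; SpamAssassin's own decoder is Perl code and may differ in details):

```python
from email.header import decode_header, make_header

# Raw subject value as it appears on the wire.
raw_subject = "=?utf-8?Q?hello?="

# Decode the RFC 2047 encoded word into plain text.
decoded_subject = str(make_header(decode_header(raw_subject)))

print(repr(decoded_subject))
# The charset label is visible only in the raw form.
print("utf-8" in raw_subject.lower())
print("utf-8" in decoded_subject.lower())
```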
    LVL 25

    Accepted Solution

    as far as the rule is concerned, if you're trying to filter only utf8 subjects, no problem, you know what you are doing.

    your new rule does not have the ^ char so it is not anchored, which is why it works
    i guess
    would be a little better but it is quite unlikely that a valid mail subject would contain those strings

    have you tried the syntax i gave ?
    "/^subject:\s*=?utf-8/i" (with the forgotten caret)

    i'm unsure that filtering all those charsets is very meaningful, but you know what you are doing. for the record, Outlook users with locales that handle accents are very likely to get filtered. depending on your usual traffic, this may or may not be a good idea
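
On the remark that a valid subject is unlikely to contain those strings: the unanchored alternation matches the charset name anywhere in the raw subject, including in plain, unencoded text. One possible tightening, which is an assumption and not something proposed in the thread, is to require the surrounding encoded-word syntax (=?charset?B? or =?charset?Q?). The sketch below compares the two in Python; the sample subjects and the names loose and strict are invented for illustration.

```python
import re

# Pattern from the thread: matches the charset name anywhere.
loose = re.compile(r"(utf-8|Cp1252|iso-8859|Windows-1252)", re.IGNORECASE)

# Hypothetical stricter variant: charset must sit inside "=?...?B?" or "=?...?Q?".
strict = re.compile(
    r"=\?(utf-8|cp1252|iso-8859-\d+|windows-1252)\?[bq]\?",
    re.IGNORECASE,
)

# Invented sample raw subjects.
samples = [
    "=?utf-8?Q?hello?=",                          # fully encoded
    "Re: =?ISO-8859-1?Q?caf=E9?= menu",           # partly encoded
    "Converting files from iso-8859-1 to utf-8",  # plain text mentioning charsets
    "Quarterly report",                           # plain text
]

for subject in samples:
    print(subject, bool(loose.search(subject)), bool(strict.search(subject)))
```

Both patterns still flag every legitimately encoded subject, so the concern about senders whose locales use accented characters applies to either one.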
    LVL 1

    Author Comment

    This appears to be working with

    Subject:raw =~ /(utf-8|Cp1252|iso-8859|Windows-1252)/i

