Is Bayesian filtering turned on?

Viewing 11 posts - 31 through 41 (of 41 total)
    #365237
    MacDave
    Participant

    To all the extremely helpful heavy hitters in this discussion: the following is an example of an ‘X-Spam-Status’ header that I’m getting in my incoming email:

    X-Spam-Status: No, hits=3.92 tagged_above=-999 required=4
    tests=DATE_IN_FUTURE_12_24, MIME_BASE64_NO_NAME, MIME_BASE64_TEXT

    My spam threshold for tagging the subject line is 4, and as you can see in the header, this message scored 3.92. So since the message was assigned a score, that obviously means SpamAssassin is working, right?

    Now there’s nothing in my header that says ‘BAYES_xx’, and the original poster was concerned that he wasn’t seeing ‘BAYES_xx’ in his headers either. If SA is working, doesn’t that mean Bayesian filtering is working? What extra filtering benefits do the ‘BAYES_xx’ entries indicate?

    Thanks to any who can elucidate for me.
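
For anyone who wants to check their own mail for this: a BAYES_nn entry in the tests list appears only when SpamAssassin's Bayes classifier actually scored the message, and the number roughly tracks the classifier's estimated spam probability. Its absence, as in the header above, suggests the Bayes database was not consulted even though the other rules ran. A minimal sketch, assuming a saved message is fed on standard input; the function name `bayes_rules` is made up for illustration:

```shell
# Print each BAYES_nn rule named in the headers of the message on stdin.
# Returns 0 if at least one was found, 1 if none (grep's exit status).
# Hypothetical helper, not part of SpamAssassin.
bayes_rules() {
    # sed stops at the first blank line, so only the header block is searched;
    # this also covers a folded X-Spam-Status header that spans several lines.
    sed '/^$/q' | grep -o 'BAYES_[0-9][0-9]*'
}

# Usage: bayes_rules < saved-message.eml
```

Searching only the header block keeps a message body that happens to quote a rule name from giving a false positive.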

    #365241
    gw1500se
    Participant

    This is all very interesting, but can someone please point me to the documentation that explains how to get this thing to learn in the first place? I see lots of discussion about cron running a job to process junk mail for learning (and this thread about linking the correct directories) but nothing about where and how to put junk mail wherever it is supposed to go so that this learning process has something to process. Thanks.

    #365365
    TvE
    Participant

    Have any of you checked whether this works after applying 10.4.5?

    I am just about to enable it on my server – but after reading this thread I am not quite sure how to do it…

    #366570
    Anonymous
    Guest

    To automatically train the junk mail filter:

    1 Enable junk mail filtering.

    2 Create two local accounts: junkmail, and notjunkmail

    3 Use Workgroup Manager to enable them to receive mail.

    4 Instruct your mail users to “Redirect” junk mail messages which have not previously
    been tagged as junk mail to “junkmail@”.

    5 Instruct your mail users to “Redirect” real mail messages which were wrongly tagged as
    junk mail to “notjunkmail@”.

    6 Each day at 1 am, the junk mail filter will learn what is junk and what was mistaken for
    junk, but is not.

    7 Delete the messages in junkmail and notjunkmail’s accounts daily.
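
To make step 6 more concrete, here is a rough sketch of the kind of commands a nightly training job could issue. This is not Apple's script. The spool path is taken from the one mentioned later in this thread, the notjunkmail path is assumed to sit next to it, and the function name is hypothetical. It only prints the commands so they can be reviewed before being run as the user that owns the Bayes database:

```shell
# Print (do not run) sa-learn commands for a nightly training pass.
# $1 = mail spool root; the default is an assumption based on this thread.
# Hypothetical helper, not Apple's learn_junk_mail script.
print_learn_commands() {
    spool="${1:-/var/spool/imap/user}"
    # Learn redirected junk as spam and redirected real mail as ham,
    # deferring the database sync until both passes are done.
    echo "sa-learn --spam --no-sync $spool/junkmail"
    echo "sa-learn --ham --no-sync $spool/notjunkmail"
    echo "sa-learn --sync"
}

# Usage: print_learn_commands
```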

    #367213
    DeputyAdmin
    Participant

    I have it set up as described in the MailAdmin.pdf, but I am not really sure that it is working. I have not done the symlinking; I was hoping it was fixed in 10.4.7.

    Can anyone confirm or deny that it is still broken in 10.4.7 or .8?

    thanks,
    Eric H

    #367356
    Anonymous
    Guest

    Right then guys. I have a 10.4.7 server. I’ve set up the symbolic link as shown in this thread and ran the learn_junk_mail script manually as root. The next day, the Bayes filter is definitely working. More mails are being stopped than before doing this, but not enough!
    Now I have been collecting a lot of junk that gets through in the junkmail account, and now have over 1000 messages. If I now run the learn_junk_mail script, it takes an age and just finishes with no report or error. When I check the bayes files in /var/clamav/.spamassassin (which are the same as /var/amavis/.spamassassin) the dates have changed but the files have not grown.
    If, however, I run sa-learn manually as root on the /var/spool/imap/user/junkmail folder, the files in /var/root/.spamassassin grow a hell of a lot. For instance, the bayes_toks file is now twice the size of the one in clamav.

    How can I find out if the learning is working correctly??

    #367365
    OsX4me
    Participant

    Don’t run sa-learn as root or, as you’ve seen, it stores the result in /var/root/.spamassassin,
    which won’t be referenced by anything else.

    sudo -u clamav sa-learn --showdots --spam --mbox --no-sync /path/to/your/junkmail

    Apple’s script at /etc/mail/spamassassin/learn_junk_mail,
    while it needs to be invoked as root, does su to clamav:

    su - clamav -c "sa-learn --spam --no-sync ## etc.

    #367557
    Moofo
    Participant

    I cracked the code!

    There are lots of things broken in Tiger Server. Out of the box, Bayesian filtering will never work!

    Go there for fixes:

    http://wiki.apache.org/spamassassin/SpamAssassin_on_Mac_OS_X_Server

    BTW, the Bayesian filter will not work until you have at least 200 messages learned in the DB.

    On my machine, the learning script was never triggered!

    With the database problem, no wonder Bayesian filtering doesn’t work…
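
One way to see how close a database is to that threshold is to read the message counters that `sa-learn --dump magic` prints. By default SpamAssassin wants 200 spam and 200 ham messages learned before it uses the Bayes rules at all. The sketch below assumes the usual dump layout, where the third column of the `nspam` and `nham` lines holds the count. The function name and its default of 200 are illustrative, not part of SpamAssassin:

```shell
# Read `sa-learn --dump magic` output on stdin and report learned counts.
# $1 = minimum required for both spam and ham (default 200).
# Returns 0 if both counts reach the minimum, 1 otherwise.
# Hypothetical helper; assumes the count is in column 3 of each line.
bayes_counts() {
    awk -v min="${1:-200}" '
        /non-token data: nspam$/ { nspam = $3 }
        /non-token data: nham$/  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            exit !((nspam + 0 >= min + 0) && (nham + 0 >= min + 0))
        }
    '
}

# Usage, run as the user that owns the database (clamav in this thread):
#   sudo -u clamav sa-learn --dump magic | bayes_counts
```

Running the dump as root instead would inspect /var/root/.spamassassin, which is not the database the mail filter reads.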

    #367650
    filipp
    Participant

    OK, some more observations on this most interesting topic:
    servu:~ root# system_profiler | grep "System Version"
    System Version: Mac OS X Server 10.4.6 (8I127)
    The server is definitely learning and /var/clamav/.spamassassin is never used:
    servu:~ root# ls -lh /var/clamav/.spamassassin/ /var/amavis/.spamassassin/
    /var/amavis/.spamassassin/:
    total 31168
    -rw------- 1 clamav clamav 9M Nov 18 13:07 auto-whitelist
    -rw------- 1 clamav clamav 10K Nov 18 13:07 bayes_journal
    -rw------- 1 clamav clamav 2M Nov 18 12:54 bayes_seen
    -rw------- 1 clamav clamav 2M Nov 18 12:54 bayes_toks

    /var/clamav/.spamassassin/:
    total 752
    -rw------- 1 clamav clamav 36K Jul 28 01:21 bayes_seen
    -rw------- 1 clamav clamav 340K Jul 28 01:21 bayes_toks

    However, a good portion of junk is still being caught and tagged:
    X-Sieve: CMU Sieve 2.2
    X-Virus-Scanned: by amavisd-new at mac.ee
    X-Spam-Status: Yes, hits=6.586 tagged_above=-999 required=3 tests=BAYES_50, DRUGS_ERECTILE, HELO_DYNAMIC_IPADDR, HTML_70_80, HTML_MESSAGE, MIME_QP_LONG_LINE
    X-Spam-Level: ******
    X-Spam-Flag: YES

    I’ll try the symlink tonight and report back.
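
Before and after making the symlink, it can help to confirm whether the two directories listed above really resolve to the same place. The differing file dates and sizes in the listing suggest they do not on this server. A small sketch using only standard shell features; `same_dir` is a made-up name:

```shell
# Succeed (exit 0) if both paths resolve to the same physical directory,
# fail with 1 if they differ, and with 2 if either cannot be entered.
# Hypothetical helper for checking that a symlink is in place.
same_dir() {
    # cd runs inside a command substitution (a subshell), so the caller's
    # working directory is untouched; pwd -P prints the symlink-free path.
    a=$(cd "$1" 2>/dev/null && pwd -P) || return 2
    b=$(cd "$2" 2>/dev/null && pwd -P) || return 2
    [ "$a" = "$b" ]
}

# Usage:
#   same_dir /var/clamav/.spamassassin /var/amavis/.spamassassin && echo linked
```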

    #367997
    stewsharpe
    Participant

    Just thought you guys might be interested: I installed 10.4.7 (clean install), upgraded to 10.4.8, and SA is working perfectly!

    Stew Sharpe

    #368368
    filipp
    Participant

    Looks like Apple picked this one up as well:
    Automatically trained SpamAssassin databases are unused: http://docs.info.apple.com/article.html?artnum=305092
