Author |
Message |
McAlly
Cadet
Joined: May 31, 2004
Posts: 8
Location: Belgium
|
Posted: Sat Jun 05, 2004 11:19 am Post subject: Change to Bayesian only |
|
|
Just to let you know.
After reading about Bayesian filtering last night (A Plan for Spam: http://www.paulgraham.com/spam.html), I decided to use the learning filter as my only filter. The one exception is a filter that marks for deletion every mail that comes from a wanted source but that I don't have time to read at the moment.
I cleared the white and black lists and even deleted the learning.dat file for a fresh start. (I've only had MWpro for a week, so no big loss.)
No new addresses are added to the white/black lists. Spamcop and FA are not used either.
That was this morning at 8:00 am. Now, at 5:00 pm, I've got 28 emails:
14 spam and 14 mails from Spamcop for confirmation.
The last 5 spams were already recognized.
Not bad, I guess.
I’ll keep you posted.
McAlly |
|
|
|
Al
Captain
Joined: May 08, 2002
Posts: 314
Location: Australia
|
Posted: Sun Jun 06, 2004 8:58 am Post subject: |
|
|
I did basically the same thing, some time ago, except that I kept my [Friends] list and one [Good] mail filter.
Bayesian hasn't missed a beat in over a month. |
|
|
|
McAlly
Cadet
Joined: May 31, 2004
Posts: 8
Location: Belgium
|
Posted: Sun Jun 06, 2004 12:19 pm Post subject: |
|
|
As I suspected, the first good mails I received were false positives, marked as possibly spam.
I sent an email to someone whose correct address I didn't know, so I tried 4 different email addresses and sent a copy to myself.
From the mail server(s) I got 2 mails back saying the address was wrong, and of course my own copy.
All 3 were marked as possibly spam.
MWpro also found a virus (a w32.Beagle.X@mm).
Up till now, I've got 110 emails (72 spam).
Bayesian found 41 of them.
Keep in mind that it took 25 mails before it marked the first spam.
Leaving those out, I got 85 emails (58 spam) with 41 found (about 70% found after day 1). |
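As a quick check on the figures in the post above, here is a small Python sketch of the two detection rates. The function and variable names are made up for illustration; only the counts come from the post.

```python
def detection_rate(found, total_spam):
    """Fraction of spam messages that the filter flagged."""
    return found / total_spam

# Counts quoted in the post.
total_mail, total_spam, found = 110, 72, 41
warmup_mail = 25              # mails received before the first spam was flagged
later_mail, later_spam = 85, 58  # counts with the warm-up period left out

# The second mail count is just the total minus the warm-up mails.
assert total_mail - warmup_mail == later_mail

overall = detection_rate(found, total_spam)       # all spam since the fresh start
after_warmup = detection_rate(found, later_spam)  # spam after the warm-up only

print(f"overall:      {overall:.1%}")
print(f"after warmup: {after_warmup:.1%}")
```

Counting every spam since the fresh start gives roughly 57%, while leaving out the first 25 mails gives the roughly 70% quoted in the post.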
|
|
|
rogerw
Major
Premium Member
Joined: May 11, 2003
Posts: 857
Location: USA
|
Posted: Sun Jun 06, 2004 12:31 pm Post subject: |
|
|
Bear in mind that several hundred emails need to be used in training for Bayesian to begin to make accurate suggestions.
You'll need to try to make sure that you have roughly an equal number of good and bad emails (or training corpus files of roughly the same size), or the weighting factors for words that good and bad emails have in common might tend to favor whichever type you have the most of.
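A rough Python sketch of why the balance matters, assuming a filter that scores a word from raw occurrence counts. This is not MWpro's actual formula, which isn't given in this thread, and both function names are hypothetical. The second function divides each count by the size of its corpus, which is roughly the idea behind the per-word formula in A Plan for Spam (that essay also adds a bias toward good mail and clamps the result, both omitted here).

```python
def raw_count_score(good_hits, bad_hits):
    """Spam score of a word from raw counts only (sensitive to corpus size)."""
    return bad_hits / (good_hits + bad_hits)

def normalized_score(good_hits, bad_hits, n_good, n_bad):
    """Spam score of a word from per-corpus rates (corrects for corpus size)."""
    good_rate = good_hits / n_good
    bad_rate = bad_hits / n_bad
    return bad_rate / (good_rate + bad_rate)

# Hypothetical corpus: 100 good mails, 900 bad mails.
# The word shows up in 10% of each, so it carries no real evidence either way.
n_good, n_bad = 100, 900
good_hits, bad_hits = 10, 90

skewed = raw_count_score(good_hits, bad_hits)
balanced = normalized_score(good_hits, bad_hits, n_good, n_bad)

print(f"raw counts: {skewed:.2f}")
print(f"normalized: {balanced:.2f}")
```

With raw counts the neutral word looks strongly spammy simply because there is more spam in the training set; with per-corpus rates it comes out neutral.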
One suggestion that might help make you happy with Bayesian is to stop training once you're getting acceptable results. Continuously adding to the training will tend to make the decisions appear to 'drift': as more words/samples are added to the 'bad' corpus, the weighting of those words will change to favor more 'junk' classifications. Likewise with adding to the 'good' corpus.
However, the content of spam might change with time, so training should never be out of the question.
I suggest:
1) Train aggressively to build up the database
2) When assignments are accurate to an acceptable level, discontinue training
3) Use Bayesian to classify only, but keep a watch on its accuracy. As mail content changes over time, the accuracy of assignment will change.
4) When the percentage of mis-assignments gets unacceptably large, enter a new period of training.
You'll always need to monitor the performance of Bayesian, but by using it to classify only for extended periods, you'll not have to monitor it as closely as you would if you were actively adding to the training at the same time.
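The four-step cycle above could be sketched in Python roughly as follows. This is only an illustration of the idea, not a feature of MWpro: the class name, both thresholds, and the window size are assumptions, and "acceptable" is whatever error rate you are willing to live with.

```python
from collections import deque

class TrainingSchedule:
    """Switch between training and classify-only based on recent error rate."""

    def __init__(self, stop_below=0.02, resume_above=0.10, window=50):
        self.stop_below = stop_below        # hypothetical "acceptable" error rate
        self.resume_above = resume_above    # hypothetical "unacceptable" error rate
        self.recent = deque(maxlen=window)  # True means the mail was misclassified
        self.training = True                # step 1: start out training

    def error_rate(self):
        """Share of misclassified mails in the current window."""
        return sum(self.recent) / len(self.recent) if self.recent else 1.0

    def record(self, misclassified):
        """Log one reviewed mail and return whether training should be on."""
        self.recent.append(bool(misclassified))
        # Only judge accuracy once a full window of mails has been reviewed.
        if len(self.recent) == self.recent.maxlen:
            rate = self.error_rate()
            if self.training and rate <= self.stop_below:
                self.training = False  # step 2: accurate enough, stop training
            elif not self.training and rate >= self.resume_above:
                self.training = True   # step 4: too many misses, train again
        return self.training

schedule = TrainingSchedule(window=10)
for _ in range(10):
    schedule.record(False)           # ten correct classifications in a row
print(schedule.training)             # training is now off (step 3: classify only)
print(schedule.record(True))         # one miss in ten reaches the resume threshold
```

Using two separate thresholds keeps the schedule from flipping back and forth when the error rate hovers around a single cut-off.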
|
|