Author |
Message |
m4gician
Cadet
Joined: Jun 01, 2004
Posts: 2
Location: Uk
|
Posted: Tue Jun 01, 2004 11:03 am Post subject: Everything is spam? |
|
|
I have been using Mailwasher for a while now, but recently it has developed a fault. Every email is listed as "Probable Spam" or "Known Spam", even when I know who is sending it. I want to teach MW to automatically bounce and delete known or probable spam, but that's dangerous now because it thinks *everything* is spam. Also, in the 'Learning' column, only the envelope is visible and the bin has disappeared.
If anyone can help me get back to normal, it would be much appreciated.
M4gician |
|
|
|
rogerw
Major
Premium Member
Joined: May 11, 2003
Posts: 857
Location: USA
|
Posted: Tue Jun 01, 2004 11:26 am Post subject: |
|
|
Sounds like you really don't have an appreciation of what the Bayesian (learning) filters can do for you and how they work.
From your description, you've 'trained' the feature with a great number of 'junk' emails and not enough 'legitimate' emails, so MW is tending to score everything as 'junk'.
If you turn the learning feature off, you'll get back the prior functionality. To turn it off, go to Tools>Options>Learning, then clear the checkboxes on each of the 3 property sheets there.
Once the feature is off, you can read up on Bayesian filtering to see how it works and decide whether you want to try using it again later. |
|
|
|
m4gician
Cadet
Joined: Jun 01, 2004
Posts: 2
Location: Uk
|
Posted: Wed Jun 02, 2004 7:06 am Post subject: |
|
|
The learning feature was fine and I was teaching it well but then it went haywire a week or so ago. Is it simply a case of training it with too many junk emails?
Do you happen to know why the dustbin icon has vanished from the learning column, leaving only the envelope? |
|
|
|
Ikeb
General
Premium Member
Joined: Apr 20, 2003
Posts: 3555
Location: Canada
|
Posted: Wed Jun 02, 2004 9:38 am Post subject: |
|
|
I had a similar thing happen but admittedly not the same implications. In my case, after several weeks of good Learning Spam Tool (LST) performance, for over a week most msgs were being left as "Unknown". I just kept training the LST and it is now behaving itself.
The 'envelope' (on my screen and with my poor eyesight I can't even make out the symbol) signifies an opportunity to "train" the LST to consider the msg as Legit. That the "trashbin" is missing indicates that the msg is currently considered Junk already and there's no need to train as Junk.
_________________
I like SPAM ... on my sandwich! |
|
|
|
rogerw
Major
Premium Member
Joined: May 11, 2003
Posts: 857
Location: USA
|
Posted: Wed Jun 02, 2004 10:34 am Post subject: |
|
|
m4gician wrote: |
Is it simply a case of training it with too many junk emails? |
That is likely the case.
The Learning tool doesn't match phrases in email like the filter tool does. Rather, it gathers statistics on individual words within what you train as 'good' mail and 'junk' (how frequently they occur in junk mail and in legit mail), then evaluates incoming mail based on the weighting factors of those same words in the incoming mail.
If you train with a disproportionate number of junk mails with respect to good mail, then the 'weighting' of all the words in the data built up will favor things being classified as junk.
When this happens, you'll need to add to the 'legitimate' training so that MW has a better statistical sample of junk/legit and a better database of words with which to make choices for you.
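The effect described above can be reproduced with a small sketch. This is not Mailwasher's actual code; it assumes a simple per-word weighting of the kind commonly used in Bayesian spam filters, and the class name `WordStats`, the clamping values 0.01/0.99, and the sample messages are all made up for illustration.

```python
import math
from collections import Counter


class WordStats:
    """Counts how many trained junk and legit messages contain each word."""

    def __init__(self):
        self.junk = Counter()
        self.legit = Counter()
        self.n_junk = 0
        self.n_legit = 0

    def train(self, text, is_junk):
        words = set(text.lower().split())  # count each word once per message
        if is_junk:
            self.junk.update(words)
            self.n_junk += 1
        else:
            self.legit.update(words)
            self.n_legit += 1

    def word_weight(self, word):
        """Rough probability that a message containing `word` is junk."""
        j = self.junk[word] / max(self.n_junk, 1)
        l = self.legit[word] / max(self.n_legit, 1)
        if j + l == 0:
            return 0.5  # word never seen in training: neutral
        # Clamp so a single word can never be absolutely decisive.
        return min(0.99, max(0.01, j / (j + l)))

    def junk_score(self, text):
        """Combine the word weights in log-odds space; result is in (0, 1)."""
        log_odds = 0.0
        for word in set(text.lower().split()):
            p = self.word_weight(word)
            log_odds += math.log(p) - math.log(1 - p)
        return 1 / (1 + math.exp(-log_odds))


stats = WordStats()
for msg in ["please buy cheap pills now",
            "you can buy this watch for your friend",
            "please see the offer for you"]:
    stats.train(msg, is_junk=True)
stats.train("meeting notes attached", is_junk=False)  # only one legit sample

incoming = "can you please see the notes for friday"
before = stats.junk_score(incoming)

# Add legit training that contains the same everyday words.
for msg in ["can you see the report for monday",
            "please send the notes for you",
            "thanks for the update can we meet"]:
    stats.train(msg, is_junk=False)
after = stats.junk_score(incoming)

print(f"junk score before extra legit training: {before:.3f}")
print(f"junk score after extra legit training:  {after:.3f}")
```

With only one legit message trained, everyday words such as "please", "you" and "for" have only ever been seen in junk, so the harmless incoming message scores well above 0.5. Once a few legit messages containing those words are trained, the same message scores well below 0.5.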
|
|
|
|
stan_qaz
General
Premium Member
Joined: Mar 31, 2003
Posts: 4112
Location: USA
|
Posted: Wed Jun 02, 2004 11:34 am Post subject: |
|
|
You might have a file corruption problem too. Close MW and delete the training.dat file; MW will recreate it the next time you start up.
If that doesn't help, try deleting all the training files and starting over. |
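For anyone who would rather not delete the file outright, here is a minimal sketch of a more cautious variant: it renames training.dat to a backup instead of deleting it, so the old training data can be put back if the reset doesn't help. The function name `reset_training_file` is hypothetical, and the folder that holds training.dat is not stated in this thread, so it has to be passed in. Close MW before running it.

```python
from pathlib import Path


def reset_training_file(data_dir, name="training.dat"):
    """Rename the training file to <name>.bak so it can be restored later.

    `data_dir` is whatever folder your installation keeps its training
    files in (an assumption: the thread does not say where that is).
    Returns the backup path, or None if the file was not found.
    An older backup with the same name is overwritten.
    """
    src = Path(data_dir) / name
    if not src.is_file():
        return None
    backup = src.with_name(src.name + ".bak")
    src.replace(backup)  # overwrites an existing backup on all platforms
    return backup


# Example call (fill in the real folder yourself):
# reset_training_file(r"<folder that holds training.dat>")
```

If the reset makes no difference, renaming the .bak file back to training.dat (again with MW closed) restores the previous state.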
|
|
|
stapel
Trooper
Joined: Apr 23, 2004
Posts: 19
Location: USA
|
Posted: Fri Jun 04, 2004 12:45 pm Post subject: |
|
|
I've had periodic problems with Mailwasher 4.1 suddenly marking obvious spam as "Probably Legitimate" (even if it is letter-for-letter the same as another message marked "Known Spam"), or not marking the "Known Spam" for blacklisting and deletion. But when I delete the Mailwasher "Training" files, this seems to reset Mailwasher to sensible behavior for another couple weeks. So try deleting the "Training" files and see if this helps.
Just my $0.02.
Eliz. |
|
|
|
AlphaCentauri
Captain
Joined: Nov 20, 2003
Posts: 302
Location: USA
|
Posted: Tue Jun 08, 2004 2:24 pm Post subject: |
|
|
rogerw wrote: |
If you train with a disproportionate number of junk mails with respect to good mail, then the 'weighting' of all the words in the data built up will favor things being classified as junk. |
Does that mean that if I get spam that has only a few quotes from Bartlett's quotations and links to an image and a URL, I should tell the learning filter that it is legit?
|
|
|
|
rogerw
Major
Premium Member
Joined: May 11, 2003
Posts: 857
Location: USA
|
Posted: Tue Jun 08, 2004 3:01 pm Post subject: |
|
|
AlphaCentauri wrote: |
Does that mean that if I get spam that has only a few quotes from Bartlett's quotations and links to an image and a URL, I should tell the learning filter that it is legit? |
Certainly not! ... and the uncommon words that might be culled from dictionaries aren't the problem.
I know you're pointing out that an image-based spam with a few words of text will be hard for a Bayesian filter to classify in the first place - and it's a good point to make for those less familiar. The other tools may be better at trapping such an email.
For the benefit of that same group, I'll restate:
Because this is a statistical method of classifying, there needs to be a statistically significant sampling of both good and junk mails so that common words (ones that appear in both junk and legit emails) don't wind up with a weighting that will tend to skew the assignments. Training with practically all junk and few legit mails will not allow the occurrence of common words in the good mail to offset their occurrence in the junk, and you'll wind up having a majority of emails classified as junk.
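To put rough numbers on the common-word problem, assume a simple per-word weight of the kind used in many Bayesian spam filters (an assumption for illustration only; Mailwasher's actual formula isn't given in this thread). Here $j_w$ and $l_w$ are the numbers of trained junk and legit messages containing word $w$, and $N_j$ and $N_l$ are the total numbers of junk and legit messages trained. The counts in the worked lines are invented.

```latex
% Assumed per-word junk weight
P(\text{junk} \mid w) \approx \frac{j_w / N_j}{\, j_w / N_j + l_w / N_l \,}

% Common word "the", trained on 200 junk (150 contain it) and 2 legit (0 contain it):
P(\text{junk} \mid \text{the}) \approx \frac{150/200}{150/200 + 0/2} = \frac{0.75}{0.75} = 1

% Same junk training, but 100 legit messages trained (75 contain it):
P(\text{junk} \mid \text{the}) \approx \frac{150/200}{150/200 + 75/100} = \frac{0.75}{1.50} = 0.5
```

In the first case an ordinary word looks like a perfect junk indicator simply because the legit sample was too small to contain it. In the second case it comes out neutral, which is what a common word should be.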
|
|
|
|
stan_qaz
General
Premium Member
Joined: Mar 31, 2003
Posts: 4112
Location: USA
|
Posted: Wed Jun 09, 2004 10:13 am Post subject: |
|
|
I just train anything that isn't classified and retrain anything that is misclassified with very good results.
Haven't had a misclassification in some time now. |
|
|
|
|