ffmg / myoddweb.classifier

Automatic spam classifier for Microsoft Outlook. It classifies emails as they arrive and moves them to your own folders. Works on Office 365 and Outlook 2016, 2013 and 2010.

Home Page: http://www.myoddweb.com

License: MIT License

C# 86.02% C++ 10.99% C 0.48% Inno Setup 2.51%
classifier magnet classifier-engine spam spam-emails outlook bayesian bayes-classifier classifies-emails outlook-addin

myoddweb.classifier's Introduction

MyOddWeb.Classifier

What is it

MyOddWeb classifier, as its name suggests, classifies emails as they arrive and moves them to predefined folders.

Traditionally, rules are used to move emails to folders. For example, an email from work would be moved to the "Work" folder.

With a classifier you can create categories, "Personal", "NSFW", "Spam" and so on, and when a mail arrives, regardless of who it is from, it will be moved to the appropriate folder.

Using the "Work" example above, if Bob from Human Resources sends you a joke email, a rule would traditionally have moved that email to the "Work" folder, (as Bob is a co-worker). With a classifier, that email would be moved to a "Joke" folder and you can then look at it later.

How does it work

Documentation

Visit the wiki.

General

MyOddWeb is a Naive Bayes classifier. In simple terms, it makes an educated guess as to which category an email belongs in, based on the thousands of emails that you might have already received.

At first, you 'train' your classifier with your own "Categories"; what you consider a "joke" might be considered "NSFW" by others.

The more you train it, the better the classifier gets.

But it does not take that long, really. Invest a couple of days creating good Magnets and good Categories and in no time your classifier will be self-sufficient.
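As a rough illustration of the idea, here is a minimal multinomial Naive Bayes sketch in Python. It is not the MyOddWeb engine's code; the class name, the tokenizer and the add-one smoothing are all assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayesSketch:
    """Toy multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> trained emails

    @staticmethod
    def tokenize(text):
        # Assumption: lower-cased runs of letters are good enough as "words".
        return re.findall(r"[a-z']+", text.lower())

    def train(self, category, text):
        """Record one email as belonging to the given category."""
        self.doc_counts[category] += 1
        self.word_counts[category].update(self.tokenize(text))

    def classify(self, text):
        """Return the most probable category, or None if nothing was trained."""
        vocabulary = {w for counts in self.word_counts.values() for w in counts}
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category, docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior: how often this category was seen during training.
            score = math.log(docs / total_docs)
            for word in self.tokenize(text):
                # Log likelihood of the word, smoothed so unseen words are not zero.
                score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
            if score > best_score:
                best_category, best_score = category, score
        return best_category
```

Training it with one "Joke" email and one "Work" email is enough to see the guess follow the words, which is why more training gives better results.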

MyOddWeb Classifier works with POP3, SMTP, NNTP proxies, IMAP and Exchange.

Setup

From setup

  • Download the setup app, (get the latest release)
    • Make sure Outlook is not running.
  • Run it
  • Start Outlook
    • You will need to accept the certificate, (first time only).
  • Create some categories
  • Create some magnets
  • Categorise some emails.

From the code

  • Get the code
  • Build it using Visual Studio 2017 or later, (I tried with 2015 as well ... and it works(ish)).
  • You can then debug it or simply run Outlook.

Versions

0.9.1.1 (10/30/2019)

  • Updated to less restrictive MIT License
  • Added option to re-scan entire folder, (for whatever reason)
    • Added progress bar when loading/processing large number of mail.
  • Empty/no categories no longer hide the menu.
  • Some code cleanup
  • Updated to engine 1.7.3
    • Sqlite 3.30.1
  • Various code optimisations.
  • Various performance / memory fixes
  • Minor other improvements

0.8.0.1 (09/01/2019)

  • Some code cleanup
  • Updated to engine 1.7.0
    • Sqlite 3.26.0

0.7.0.5 (09/11/2017)

  • Added an option to train items that are manually moved from one folder to another.
    • Also added an option to ask the user if they want to select a category in the case where we have more than one.
  • Added - On start up we now log version number + engine number.
  • Added Exception logging, (higher level than error)
  • Updated to 1.6.5 engine.
    • Sqlite 3.21.0
  • Changed the way we handle emails when the wait time is 0, (we handle it right away).
  • Ignored conversations no longer cause messages to be classified, (as they are normally deleted)
  • Did some code cleanup
    • Changed from MSTest to NUnit, (because of Moq)
    • Create multiple interfaces for the engine.
  • Fixed an issue with logging, not all types of logs were added to the db.
  • Fixed a couple of colour issues in the option dialog.

0.7.0.2-0.7.0.4 (04/10/2017)

  • Started work on the wiki.
  • Added Minimum category percent.
  • We now 'timestamp' the last received time; that way, we can check emails that arrived while the classifier was not running, (in Exchange for example).
    • Added CheckUnProcessedEmailsOnStartUp option in case you don't want that check, (if you are happy with the Exchange rules being applied or if you have another Outlook applying rules already).
  • Updated to 1.6.3 engine + interface.
  • Added option to delay the classification, (to allow rules to fire).
    • Fixed a bug in the delayed emails (0.7.0.4).
  • Added option to set the number of entries we want to display.
  • Changed the behaviour of some option values to be read only, (you could not change the value anyway, so it was a bit misleading)
  • Changed GetLogEntries( ... ) to now return the entry ID as well.
    • The log summary now has the log id in it to make it easier to find.
  • Fixed an issue with the order of the last 'x' log entries being displayed, (was mostly un-ordered).
  • Fixed an option spelling mistake, "CheckIfUnownCategory" > "CheckIfUnKnownCategory".
  • Fixed wording of "Common word percent", (functionality is ok, just wording).

0.7.0.1 (17/09/2017)

  • Updated to 1.6.2 engine.
  • Added log options, (level + retention time)
  • Added log viewer.
  • Fixed setup so we don't check x64 registry.

0.6.0.1 - 0.6.0.5 (20/04/2017)

  • Added debug log messages to time how slow classification is.
  • Updated to classifier engine 1.5.7.0, (faster)
    • Added a couple of css values to not parse, (max-width/width etc...)
  • Updated the interface 0.6.0.5, (no change, just align the number)
  • Fixed a couple of issues with the 'detailed' html view.
    • Speed up the way the display is created.
    • Fixed actual html created, (invalid tables and so on).
  • Changed the way the interface loads Outlook folders, they are now loaded dynamically rather than at run time only.

0.5.0.1 - 0.5.0.2 (06/11/2016)

  • Option to automatically train classified mail, (false by default)
  • Option to automatically train classified mail using magnet, (true by default)
  • Updated to 1.5.2 engine, (has better/faster classification)
    • Classifications are now threaded.
  • Options text now gives the version number + engine version number.

0.4.0.1 - 0.4.0.2 (02/11/2016)

  • Fixed a few issues in the tokenizer.
  • Made some changes in the engine with the lessons learned over the last 6 months.
    • More html code is now ignored.
  • We can now see the classification details to see each word's classification.
    • Added a menu option to display raw-text classification.
    • Added a 'viewer' application to test raw text to confirm that the engine is working as expected.

0.3.0.5 (01/06/2016)

  • Fixed a couple of EventSource issues.
  • Changed the output name of the setup.exe so it includes the version number.

0.3.0.4 (17/02/2016)

  • Updated to new Classifier engine 1.1.0, (bug fixes)
    • New setting to remove very common words, ("Option.CommonWordsMinPercent").
  • Added GetEngineVersion() to use the new engine version.
  • Added possibility to change the common word percentage.

0.3.0.3 (01/02/2016)

  • Updated to new Classifier engine, (bug fixes)

0.3.0.2 (20/01/2016)

  • Removed a couple of outlook/html/xml words that could confuse the classifier.

0.3.0.1 (24/12/2015)

  • Fixed a typo in the name

0.3.0.0 (24/12/2015)

  • Added weights
  • Added option to set the magnets/user specified items weight.

0.2.1.0 (11/11/2015)

  • Removed dependency on .NET 4.5.2 and replaced it with more realistic 4.5.0

0.2.0.0 (09/11/2015)

  • By default we compile the x86 and x64 binaries.
  • Created 2 output folders.
  • We now automatically load the x64 or x86 dll depending on the version of outlook running.

0.1.0.0 (05/11/2015)

  • Beta
  • Initial release

FAQ

General

What are magnets

Magnets are like 'shortcuts' for your classifier. For example, if you get an email from your mom, you can create a magnet to always classify her emails as 'Personal'. This will help the classifier as well as prevent some of her mails being wrongly classified. As a rule, the more magnets you have, the better.

NB: Should you add a magnet for spam emails? The short answer is no. Spam emails come from thousands if not hundreds of thousands of servers, and you cannot realistically create a magnet for all of them. You could always create one for a repeat offender, but personally, I don't have one.
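The 'shortcut' behaviour can be sketched in a few lines of Python. This is only an assumed model of a magnet (a field to look at, some text to find, and the category to use); the names `Magnet`, `apply_magnets` and `categorise` are hypothetical and not taken from the add-in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Magnet:
    field: str     # hypothetical: which part of the email to inspect, e.g. "from"
    pattern: str   # text that must appear in that field
    category: str  # category to use when the pattern is found


def apply_magnets(email, magnets):
    """Return the category of the first matching magnet, or None."""
    for magnet in magnets:
        value = email.get(magnet.field, "")
        if magnet.pattern.lower() in value.lower():
            return magnet.category
    return None


def categorise(email, magnets, fallback):
    """Magnets win; otherwise ask the classifier (passed in as 'fallback')."""
    return apply_magnets(email, magnets) or fallback(email)
```

With a magnet such as `Magnet("from", "mom@example.com", "Personal")`, an email from that address is categorised without the classifier having to guess, while everything else still goes through the classifier.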

How quickly does the app learn

It is fairly quick, but you should monitor the first few days to make sure that everything is on the right path. Eventually, it will keep on learning and you will not need to worry.

NB: You should always check your spam folder to make sure that nothing has been sent there by mistake.

How many magnets/categories can I have

There is no limit on the number of categories and/or magnets. But you have to be realistic and not have too many.

Weights

Weights are a way of marking an email as more or less important depending on how it was categorised.

For example, if you use a magnet then the weight of that email is slightly more than when the engine tries to categorize the email itself.

When you categorize an email yourself, then this is considered to have the greatest weight, (who knows better than you what an email should be).
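One way to picture this is a training step that counts each word more than once depending on where the category came from. The sketch below is an assumption: the text above only gives the ordering (engine, then magnet, then you), so the numbers in `WEIGHTS` and the function name `train_weighted` are made up for illustration.

```python
from collections import Counter

# Hypothetical values; only the ordering engine < magnet < user comes from the text.
WEIGHTS = {"engine": 1, "magnet": 2, "user": 3}


def train_weighted(word_counts, words, source):
    """Add each word to the category's counts, scaled by the source's weight."""
    weight = WEIGHTS[source]
    for word in words:
        word_counts[word] += weight
    return word_counts
```

Here `word_counts` would be the `Counter` of one category, so an email you categorised yourself moves the counts further than one the engine categorised on its own.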

Database

Where is the database

The database is located in your "%appdata%\myoddweb\Classifier" folder.

What format is the database

It is a SQLite database.
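If you want to look inside, any SQLite tool will do. As a small sketch, the Python below builds the folder path from the APPDATA environment variable and lists the tables of a database file opened read-only. The function names are made up for the example, and the database file name and its tables are not documented here, so nothing about them is assumed.

```python
import os
import sqlite3
from contextlib import closing
from pathlib import Path


def classifier_folder(env=None):
    """Return the %appdata%\\myoddweb\\Classifier folder as a Path."""
    env = os.environ if env is None else env
    return Path(env["APPDATA"]) / "myoddweb" / "Classifier"


def list_tables(database_path):
    """List table names of a SQLite file, opened read-only so nothing changes."""
    uri = Path(database_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as connection:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Close Outlook first, then pass whichever database file you find in `classifier_folder()` to `list_tables`.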

Can I clear my database

Yes. Just close Outlook, delete the database and restart Outlook.

Todo

  • Create some tutorials.
    • How to setup
    • Manage categories[0.5.0.4]
    • Manage Magnets[0.5.0.4]
    • Maybe create a youtube account and explain various features.
  • Give a link to the setup program on http://www.myoddweb.com
  • Update the site with more information.
  • Test on versions of Outlook, please add a message if you are able to test.
    • Test on Office 365[0.9.1.1]
    • Test on outlook 2016[0.6.0.5]
    • Test on outlook 2013[0.5.0.4]
    • Test on outlook 2010[0.5.0.4]
    • Test on outlook 2007
  • Test on 32bit versions of Outlook.
    • On 32bit machines[0.6.0.5]
    • On 64bit machines[0.6.0.5]
  • Test on 64bit versions of Outlook[0.6.0.5]
  • Ask someone to give us a certificate rather than the temp one we have.
  • Languages
  • Add copyright notice in front of all the files...

LICENSE

MIT License

myoddweb.classifier's People

Contributors

dependabot[bot], ffmg


myoddweb.classifier's Issues

Releases for x86

Hey, would you be able to release the setup executables for 32bit?

No Mail Notification

It's not a real issue, but if the mail is classified and then moved to the correct folder, the Outlook mail notification doesn't open anymore. So you get no notification about new mails.

Maybe it would help to delay the move of the mails, or could the add-in itself trigger the notification?

Thanks for the add-in, thumbs up. I've searched for a very long time for an alternative to "popfile" and "outclass" (outclass only works correctly up to Outlook 2003) ...

On first Outlook start - Inbox mails are not moved

Hi,

I use Outlook 2013 with Exchange Server.

Use Case:

  • shut down and end Outlook
  • put the laptop in standby in the evening
  • in the morning I start the laptop and then I start Outlook
  • Outlook shows me new mails in the Inbox
  • the mails are classified, but after the delay (10s) the mails are not moved to the folders
  • when I close Outlook and start it again
  • the mails in the Inbox are now moved to the folders after the delay
  • all other new mails are then also moved

Can you check if you can replicate this?

Get classification information

We need to alter the engine to give details regarding the classification

  • Words ignored during classification
  • Percentage probability
  • Words count for document / Word count in category

It would be great if this could be exported in some way for investigation.

Maybe give this information as a 'property' of some sort with a details summary for each category.

'Details' html code not valid html code

The 'details' page has invalid code, the DOCTYPE is wrong and we are missing the <html> ... </html> code.

<!DOCTYPE html><style>... 

should be

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
....

And <style> should be <style type="text/css"> as well.
