camillereaves / fraud-detection-using-ml

This project forked from robertnfisher/fraud-detection-using-ml

Senior Capstone project using LexisNexis's HPCC Systems and the ECL language to develop a fraud detection model built on a static rule set and an optional Logistic Regression or Agglomerative Hierarchical Clustering model.

ECL 95.19% Python 4.81%

Fraud Detection Using Machine Learning

Senior Capstone Project with LexisNexis's HPCC Systems @Kennesaw State University

This project uses LexisNexis's HPCC Systems and ECL to analyze databases of credit card transaction data to detect fraud and anomalies. The project will first conduct data preprocessing and deterministic modeling, using a static ruleset to mark the most obvious anomalous factors. Two different Machine Learning models will then be implemented: one supervised, using Logistic Regression, and one unsupervised, using an Agglomerative Hierarchical Clustering technique. Their results will be compared to determine which is the better method, and Python data visualization libraries will then be used to visualize and interpret the output in a “Client Report”.
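As a rough illustration of what a static ruleset pass could look like, here is a minimal Python sketch. It is not the project's ECL implementation: the `Transaction` fields, the rule names, and the thresholds are all hypothetical and chosen only to show the idea of marking records with deterministic rules before any model is trained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Transaction:
    # Hypothetical fields; the real dataset's columns may differ.
    transaction_id: int
    amount: float
    hour: int  # hour of day, 0-23


Rule = Callable[[Transaction], bool]

# Hypothetical rules and thresholds, for illustration only.
RULES: Dict[str, Rule] = {
    "large_amount": lambda t: t.amount >= 5000.0,
    "night_time": lambda t: t.hour < 5,
}


def flag(transaction: Transaction) -> List[str]:
    """Return the names of every rule the transaction triggers."""
    return [name for name, rule in RULES.items() if rule(transaction)]


transactions = [
    Transaction(1, 25.0, 14),
    Transaction(2, 7200.0, 3),
    Transaction(3, 80.0, 2),
]

# Map each transaction id to the list of rules it triggered.
flags = {t.transaction_id: flag(t) for t in transactions}
print(flags)
```

Keeping each rule as a named predicate makes it easy to report which rule fired for a given record, which is the kind of detail a client-facing report would likely need.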

Technologies Used: ECL with the ML_Core, LogisticRegression, & LearningTrees libraries; HPCC Systems; SQL; Python with Pandas, NumPy, Seaborn, and matplotlib.


File Run Order:

First import the transactions and identity data. The two can be imported in either order, but both MUST be complete before running job 05.

For transactions:

1. 01_Data_Import
2. 02_Data_Import_Validate_Job
3. OPTIONAL: 03_Data_Patterns_Job
4. 04_Clean_Job

For identity:

1. identity/01_Data_Import
2. identity/02 Import_Validate_Job
3. OPTIONAL: identity/03_Data_Patterns_Job
4. identity/04_Clean_Job

Then you can run 05_Enrich_Data. After this, any of the models can be run.

This is the run order when you're running from scratch. If you have already imported, validated, and cleaned both the transactions and identity data using the current code version, you do not need to repeat those steps.
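The run order above is effectively a small dependency graph: two independent chains that both feed 05_Enrich_Data. The Python sketch below encodes that with the standard library's `graphlib`. The job names are copied from the lists above; the optional 03 jobs are left out, and the `DEPENDENCIES` mapping is an illustrative assumption rather than anything shipped with the project.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each job maps to the set of jobs that must finish before it.
# Optional 03_Data_Patterns_Job steps are omitted from this sketch.
DEPENDENCIES = {
    "01_Data_Import": set(),
    "02_Data_Import_Validate_Job": {"01_Data_Import"},
    "04_Clean_Job": {"02_Data_Import_Validate_Job"},
    "identity/01_Data_Import": set(),
    "identity/02 Import_Validate_Job": {"identity/01_Data_Import"},
    "identity/04_Clean_Job": {"identity/02 Import_Validate_Job"},
    # The enrich job needs both cleaned datasets.
    "05_Enrich_Data": {"04_Clean_Job", "identity/04_Clean_Job"},
}

# One valid execution order; the two chains may interleave.
run_order = list(TopologicalSorter(DEPENDENCIES).static_order())

for step, job in enumerate(run_order, start=1):
    print(f"{step}. {job}")
```

The exact interleaving of the transactions and identity chains can vary, which matches the note that either can be imported first; the only fixed point is that 05_Enrich_Data comes after both.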


Resources:

ML Core ECL library: hpcc-systems/ML_Core

Logistic Regression: hpcc-systems/LogisticRegression

Random Forest: hpcc-systems/LearningTrees

Python libraries: Pandas, NumPy, Seaborn, and matplotlib.pyplot
