ds-logistic-tuning-lab-qa-internal's Introduction

import pandas as pd
import numpy as np
import itertools

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

from sklearn.metrics import roc_curve, auc
from sklearn.metrics import confusion_matrix

from imblearn.over_sampling import SMOTE, ADASYN

import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

Predicting Credit Card Fraud

Load the creditcard.csv file, split it into training and test sets, and fit a logistic regression model to the training data.
Then plot the ROC curve and confusion matrix for your test set.

# here we load a compressed csv file.
df = None
# inspect the first few lines
df.head()

Count the number of instances in each class

# your code here
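
As a minimal sketch of the counting step, the snippet below uses a tiny hand-made frame (`toy_df` is a hypothetical stand-in, not the real data) and assumes the label column is named `Class`, as in the lab. `value_counts()` gives absolute counts, and `normalize=True` gives the share of each class, which makes the imbalance easier to read.

```python
import pandas as pd

# Hypothetical stand-in for the real data: only the "Class" column matters here.
toy_df = pd.DataFrame({"Class": [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]})

# Absolute number of rows per class, most frequent class first
class_counts = toy_df["Class"].value_counts()

# Relative frequency of each class
class_shares = toy_df["Class"].value_counts(normalize=True)

print(class_counts)
print(class_shares)
```

On the real data the same two calls would be applied to `df["Class"]`.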

Separate the class column (y) from the rest of the data set (X) and use train_test_split() to create a training set and a test set.

X = df[df.columns[:-1]]
y = df.Class
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

Use scikit-learn's LogisticRegression() and get the true positive rate, false positive rate, and thresholds using roc_curve().

logreg = None
y_score = None
# get tpr, fpr and thresholds
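
One way this step could look is sketched below. It runs on synthetic data from `make_classification` (an assumption, standing in for the credit card file), and all `demo_` and `X_tr`/`X_te` names are hypothetical so they do not overwrite the lab's own variables. The scores passed to `roc_curve()` are continuous decision values, not hard class predictions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the credit card data (assumption)
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, random_state=0, stratify=y_demo
)

# Fit on the training split only
demo_logreg = LogisticRegression(max_iter=1000)
demo_logreg.fit(X_tr, y_tr)

# Continuous scores for the test split; roc_curve sweeps a threshold over them
demo_score = demo_logreg.decision_function(X_te)
fpr, tpr, thresholds = roc_curve(y_te, demo_score)

# Area under the ROC curve
roc_auc = auc(fpr, tpr)
print(f"AUC: {roc_auc:.3f}")
```

`predict_proba(X_te)[:, 1]` would work as the score as well; both give the same ranking of samples and therefore the same curve.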

Create an ROC plot using seaborn.

# Create seaborn plot here

Plot a confusion matrix here.

# Create a function for a confusion matrix here. Make sure to add a normalization option
def plot_confusion_matrix(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    None

Make the y_hat_test predictions and create the confusion matrix using confusion_matrix(). Then plot it with your newly created function.

y_hat_test = None
cnf_matrix = None
# use new plot_confusion_matrix() function
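
The normalization option mentioned above can be sketched as a row-wise division, so that each row (one true class) sums to 1. The labels below are hand-made and hypothetical. This only illustrates the arithmetic, not the plotting function itself.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Tiny hand-made labels (hypothetical), just to show the arithmetic
y_true_demo = [0, 0, 0, 0, 1, 1]
y_pred_demo = [0, 0, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes
cm_demo = confusion_matrix(y_true_demo, y_pred_demo)

# Divide every row by its total so each row sums to 1
cm_normalized = cm_demo.astype(float) / cm_demo.sum(axis=1, keepdims=True)

print(cm_demo)        # [[3 1], [1 1]]
print(cm_normalized)  # [[0.75 0.25], [0.5 0.5]]
```

With heavily imbalanced classes, the normalized view keeps the minority row readable, because its entries are no longer dwarfed by the majority counts.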

Tuning

Try some of the various techniques proposed to tune your model. Compare your models using AUC, the ROC curve, or another metric. Start by trying different values for the regularization weight (the C parameter of LogisticRegression) and visualize the results.

# Now let's compare model performance for a few different regularization strengths on the dataset:
# plot the result
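
A comparison along these lines could be sketched as a loop over candidate values of `C`. The data is again synthetic (an assumption), and the list `C_values` is an arbitrary illustrative grid, not one prescribed by the lab. In scikit-learn, `C` is the inverse of the regularization strength, so smaller values regularize more strongly.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the credit card data (assumption)
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, random_state=0, stratify=y_demo
)

# Illustrative grid; smaller C means stronger regularization
C_values = [0.001, 0.01, 0.1, 1, 10, 100]
auc_by_C = {}

for C in C_values:
    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X_tr, y_tr)
    scores = model.decision_function(X_te)
    fpr_c, tpr_c, _ = roc_curve(y_te, scores)
    auc_by_C[C] = auc(fpr_c, tpr_c)

for C, score in auc_by_C.items():
    print(f"C={C}: AUC={score:.3f}")
```

Keeping `fpr_c` and `tpr_c` for each value would allow all curves to be drawn on one set of axes for the plot.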

SMOTE

Repeat what you did before but now using the SMOTE class from the imblearn package in order to improve the model's performance on the minority class.

print(y_train.value_counts()) # Original class distribution
# Resample X_train and y_train here
print(pd.Series(y_train_resampled).value_counts()) # Preview the class distribution after synthetic resampling
# Now let's compare model performance for a few different regularization strengths on the dataset using SMOTE
# plot the result

Analysis

Describe what is misleading about the AUC score and ROC curves produced by this code.
