Class Imbalance for ML
Is it possible to distinguish between bad and good connections based on the given features?
You are involved in a project in which you are tasked with building a machine learning algorithm that distinguishes between "bad" connections (called intrusions or attacks) and "good" (normal) connections. Note that normal connections outnumber bad ones, so the two classes are imbalanced.
The analysis is divided as follows:
- Exploration of the 39 numerical variables
- Exploration of the 3 categorical variables
- Input Data
- Normalized Variables
- One-hot encode
- Split Data Set
- Train & Test Model
- Evaluate Model Performance
- Class Imbalance
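
To make the last item of the outline concrete, here is a minimal sketch of why class imbalance matters when evaluating a model. The labels below are hypothetical toy data (95 normal connections and 5 attacks), not the project's data set, and the helper functions `accuracy`, `recall`, and `balanced_class_weights` are illustrative names introduced here. The weighting formula `n_samples / (n_classes * class_count)` is the common "balanced" heuristic, also used by scikit-learn's `class_weight="balanced"` option.

```python
from collections import Counter

# Hypothetical toy labels (assumption, not the real data):
# 95 normal connections and 5 attacks.
y_true = ["normal"] * 95 + ["attack"] * 5

# A trivial baseline that always predicts the majority class.
y_pred = ["normal"] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` samples that were predicted as `positive`."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


baseline_accuracy = accuracy(y_true, y_pred)
attack_recall = recall(y_true, y_pred, positive="attack")
weights = balanced_class_weights(y_true)

print(f"accuracy: {baseline_accuracy:.2f}")
print(f"attack recall: {attack_recall:.2f}")
print(f"class weights: {weights}")
```

On this toy data the baseline scores high on accuracy while detecting no attacks at all, which is why the evaluation step should look at per-class metrics such as recall, and why the rarer class is often given a larger weight during training.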