Data mining project for everyone
- Import libraries
- Import data
- Read from CSV
- Connect to a database
- Metadata and data types
- Metadata
- Modify data types
- Duplicate detection
- Find duplicates
- Remove duplicates
- Dataframe manipulation
- Columns and rows
- Aggregation, filtering and sorting
- Combine dataframes
- Outlier detection
- Tukey’s test for extreme values
- Kernel density estimation
- Variable generation and manipulation
- Generate new variables
- Bucket variables
- Encode categorical variables
- Generate dummy variables
- Preparation of data for modeling
- Draw samples and split dataset
- Reshape data for modeling