For this lab, we will be using the same notebook as in the previous lab.
- Case Study
- Get data
- Cleaning/Wrangling/EDA
- Processing Data
- Modeling and Validation
- Reporting
- Open the notebook created for Lab-cleaning-numerical-data.
- Find all of the categorical data. Save it in a categorical_df variable.
- Check for NaN values. Decide what to do with them and do it now.
- Check all unique values of each column.
- Check dtypes. Do they all make sense as categorical data?
- Does any column contain both alphabetic and numeric data? Decide how to clean it and do it now.
- Would you choose to do anything else to clean or wrangle the categorical data? Comment your decisions and do it now.
- Compare policy_type and policy. What information is contained in these columns? Can you identify what is important?
- Check the number of unique values in each column. Can any values be combined to ease encoding? Comment your thoughts and make those changes.
- Save the cleaned categorical dataframe as categorical.csv. You will use this file again this week.
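
The steps above could be sketched in pandas roughly as follows. This is a minimal illustration, not a reference solution. It runs on a small made-up dataframe instead of the lab data. Only `policy_type`, `policy`, `categorical_df` and `categorical.csv` come from the lab text; `customer_df`, the `state` and `income` columns, all the sample values, and the cleaning choices (filling NaN with the most frequent value, keeping only the level part of `policy`) are assumptions made for the example.

```python
import pandas as pd

# Toy stand-in for the lab data. Only the column names policy_type and policy
# come from the lab text; the other columns and all values are made up.
customer_df = pd.DataFrame({
    "policy_type": ["Personal Auto", "Corporate Auto", "Personal Auto", "Special Auto"],
    "policy": ["Personal L3", "Corporate L2", "Personal L1", "Special L2"],
    "state": ["California", None, "Oregon", "California"],  # hypothetical column
    "income": [56274, 0, 48767, 21604],                     # numeric, excluded below
})

# Treat every non-numeric column as categorical.
categorical_df = customer_df.select_dtypes(exclude="number").copy()

# Count NaN values per column, then apply one possible policy:
# fill each gap with the most frequent value of its column.
nan_counts = categorical_df.isna().sum()
for col in categorical_df.columns:
    if categorical_df[col].isna().any():
        most_frequent = categorical_df[col].mode()[0]
        categorical_df[col] = categorical_df[col].fillna(most_frequent)

# Inspect unique values, their counts, and the dtypes.
for col in categorical_df.columns:
    print(col, categorical_df[col].unique())
print(categorical_df.nunique())
print(categorical_df.dtypes)

# Flag columns in which at least one value contains both letters and digits.
mixed_cols = []
for col in categorical_df.columns:
    values = categorical_df[col].astype(str)
    has_both = values.str.contains(r"[A-Za-z]") & values.str.contains(r"\d")
    if has_both.any():
        mixed_cols.append(col)

# Compare policy with policy_type: does the first word of policy
# simply repeat the first word of policy_type?
prefix_matches = (
    categorical_df["policy"].str.split().str[0]
    == categorical_df["policy_type"].str.split().str[0]
).all()

# One possible cleaning choice (an assumption, not the required answer):
# if the prefix is redundant, keep only the level part of policy.
if prefix_matches:
    categorical_df["policy"] = categorical_df["policy"].str.split().str[-1]

# Save the cleaned categorical data for later use.
categorical_df.to_csv("categorical.csv", index=False)
```

On the real data, check the output of each inspection step before picking a NaN strategy or merging values. Which choice is right depends on how many values are missing and how the categories are distributed.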