In this lab, you will compare your own logistic regression implementation with scikit-learn's and investigate the tuning parameters that can be adjusted in the model.
- Understand and implement logistic regression
- Compare logistic model outputs
In the previous lab, you recreated the output of a scikit-learn logistic regression model that did not include an intercept or regularization. Here, you will continue by analyzing the impact of several tuning parameters, including the intercept and the regularization parameter, which we have not discussed previously.
As with the previous lab, import the dataset stored in heart.csv.
#Your code here
Define X and y as in the previous lab. This time, follow best practices and also implement a standard train-test split.
For consistency of results, use random_state=17.
#Your code here
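As a minimal sketch of the split step, the example below uses a small synthetic dataset rather than heart.csv, so the arrays `X` and `y` and their sizes are assumptions for illustration only. It relies on scikit-learn's `train_test_split` with its default 25% test size.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the heart data (assumption): 200 rows, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# The default split holds out 25% of the rows; random_state=17 makes it repeatable.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=17)
print(X_train.shape, X_test.shape)
```

In the lab itself, X and y would come from the heart.csv DataFrame instead of the random arrays.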
Use your code from the previous lab to once again train a logistic regression model on the training set.
# Your code here
#Your code here
#Your code here
# Your code here
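The previous lab's code is not reproduced here, so the following is only one possible sketch of a hand-written trainer: batch gradient ascent on the mean log-likelihood, with no intercept and no regularization. The function names `sigmoid` and `fit_logistic`, the learning rate, the iteration count, and the synthetic data are all assumptions.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, max_iterations=5000, alpha=0.1):
    """Gradient ascent on the mean log-likelihood (no intercept, no penalty)."""
    weights = np.zeros(X.shape[1])
    for _ in range(max_iterations):
        predictions = sigmoid(X @ weights)
        # Gradient of the mean log-likelihood with respect to the weights.
        gradient = X.T @ (y - predictions) / len(y)
        weights += alpha * gradient
    return weights

# Synthetic training data (assumption) generated from known weights.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(300, 2))
true_weights = np.array([2.0, -1.0])
y_train = (rng.random(300) < sigmoid(X_train @ true_weights)).astype(int)

weights = fit_logistic(X_train, y_train)
train_probs = sigmoid(X_train @ weights)
print(weights)
```

Applying `sigmoid(X_test @ weights)` to a held-out test set would give the test-set probabilities used in the next step.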
Use a standard decision boundary of 0.5 to convert the probabilities output by your logistic regression model into binary classifications. (Again, this should be for the test set.) Afterwards, feel free to use the built-in scikit-learn methods to compute the confusion matrix, as discussed in previous sections.
# Your code here
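A small sketch of the thresholding step, using made-up labels and probabilities (the arrays `y_test` and `test_probs` below are illustrative assumptions, not results from the heart data). It treats a probability of exactly 0.5 as the positive class, which is a convention you may choose differently.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Illustrative labels and predicted probabilities (assumption).
y_test = np.array([0, 0, 1, 1, 1, 0])
test_probs = np.array([0.2, 0.6, 0.8, 0.4, 0.9, 0.1])

# Probabilities at or above 0.5 become class 1, the rest class 0.
y_pred = (test_probs >= 0.5).astype(int)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)
# [[2 1]
#  [1 2]]
```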
Do the same using the built-in method from scikit-learn. To start, create a model identical to the one from the last section: turn off the intercept and set the regularization parameter, C, to a ridiculously large number such as 1e16.
# Your code here
#Your code here
#Your code here
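One way this configuration might look is sketched below on synthetic data (the data and the `max_iter=1000` setting are assumptions). In scikit-learn, C is the inverse of the regularization strength, so a very large C makes the penalty negligible.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic training data (assumption) with noisy, non-separable labels.
rng = np.random.default_rng(17)
X_train = rng.normal(size=(300, 2))
logits = X_train @ np.array([1.5, -1.0])
y_train = (rng.random(300) < 1 / (1 + np.exp(-logits))).astype(int)

# No intercept, and C so large that regularization is effectively off.
model = LogisticRegression(fit_intercept=False, C=1e16, max_iter=1000)
model.fit(X_train, y_train)

print(model.coef_, model.intercept_)
train_probs = model.predict_proba(X_train)[:, 1]
```

With `fit_intercept=False`, the fitted `intercept_` is reported as zero, which should make the coefficients comparable to a hand-written model without an intercept.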
Now add an intercept to the scikit-learn model. Keep the regularization parameter C set to a very large number such as 1e16. Plot the ROC curves of all three models on the same graph.
# Your code here
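The sketch below shows how ROC curves for several models could be drawn on one set of axes. It covers only the two scikit-learn variants (with and without an intercept) and uses synthetic data, both of which are assumptions; a hand-written model's test-set probabilities could be added with one more `roc_curve` call.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic data (assumption); features are shifted so an intercept can matter.
rng = np.random.default_rng(17)
X = rng.normal(loc=1.0, size=(400, 3))
logits = X @ np.array([1.0, -0.5, 0.8]) - 1.5
y = (rng.random(400) < 1 / (1 + np.exp(-logits))).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=17)

models = {
    "no intercept": LogisticRegression(fit_intercept=False, C=1e16, max_iter=1000),
    "with intercept": LogisticRegression(fit_intercept=True, C=1e16, max_iter=1000),
}

test_aucs = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # decision_function returns the raw scores used to rank test observations.
    fpr, tpr, _ = roc_curve(y_test, model.decision_function(X_test))
    test_aucs[name] = auc(fpr, tpr)
    plt.plot(fpr, tpr, label=f"{name} (AUC = {test_aucs[name]:.3f})")

# Diagonal reference line for a classifier that guesses at random.
plt.plot([0, 1], [0, 1], linestyle="--", color="gray")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend(loc="lower right")
```

In a notebook the figure renders automatically; in a script, a call to `plt.show()` would be needed.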
Now, experiment with altering the regularization parameter. At minimum, create 5 different subplots with varying regularization (C) parameters. For each, plot the ROC curves of the train and test sets for that specific model.
Values of C between 1 and 20 are recommended. Keep in mind that in scikit-learn C is the inverse of the regularization strength, so smaller values mean stronger regularization. Observe the difference between the test and train AUC as you go along.
# Your code here
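A possible layout for this experiment is sketched below: one subplot per value of C, each showing the train and test ROC curves. The synthetic data, the particular C values, and the figure size are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic data (assumption) standing in for the heart dataset.
rng = np.random.default_rng(17)
X = rng.normal(size=(400, 5))
logits = X @ np.array([1.0, -0.5, 0.8, 0.0, 0.3])
y = (rng.random(400) < 1 / (1 + np.exp(-logits))).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=17)

C_values = [1, 5, 10, 15, 20]
fig, axes = plt.subplots(1, len(C_values), figsize=(20, 4), sharey=True)

results = {}
for ax, C in zip(axes, C_values):
    model = LogisticRegression(C=C, max_iter=1000)
    model.fit(X_train, y_train)
    # Plot train and test ROC curves for this value of C on the same axes.
    for label, X_part, y_part in [("train", X_train, y_train), ("test", X_test, y_test)]:
        fpr, tpr, _ = roc_curve(y_part, model.decision_function(X_part))
        results[(C, label)] = auc(fpr, tpr)
        ax.plot(fpr, tpr, label=f"{label} AUC = {results[(C, label)]:.3f}")
    ax.set_title(f"C = {C}")
    ax.set_xlabel("False positive rate")
    ax.legend(loc="lower right")

axes[0].set_ylabel("True positive rate")
fig.tight_layout()
```

The `results` dictionary keeps the AUC for each (C, split) pair, which makes it easy to compare the train and test values side by side.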
#Your response here
In this lab, you reviewed many of the accuracy measures for classification algorithms and observed the impact of additional tuning parameters such as regularization.