Primary intended uses: This model is an example probability-of-default classifier. Its example use case is determining eligibility for a credit line increase.
Primary intended users: Students in GWU DNSC 6301 bootcamp.
Out-of-scope use cases: Any use beyond an educational example is out-of-scope.
history of past payment; PAY_0 = the repayment status in September, 2005; PAY_2 = the repayment status in August, 2005; ...; PAY_6 = the repayment status in April, 2005. The measurement scale for the repayment status is: -1 = pay duly; 1 = payment delay for one month; 2 = payment delay for two months; ...; 8 = payment delay for eight months; 9 = payment delay for nine months and above
BILL_AMT1 - BILL_AMT6
inputs
float
amount of bill statement; BILL_AMT1 = amount of bill statement in September, 2005; BILL_AMT2 = amount of bill statement in August, 2005; ...; BILL_AMT6 = amount of bill statement in April, 2005
PAY_AMT1 - PAY_AMT6
inputs
float
amount of previous payment; PAY_AMT1 = amount paid in September, 2005; PAY_AMT2 = amount paid in August, 2005; ...; PAY_AMT6 = amount paid in April, 2005
DELINQ_NEXT
target
int
whether a customer's next payment is delinquent (late), 1 = late; 0 = on-time
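The repayment-status scale and the binary target described in the data dictionary can be illustrated with a small sketch. The function names `describe_repayment_status` and `is_valid_target` are hypothetical helpers, not part of the course materials, and the sketch encodes only the codes documented above; the actual data may contain other values that this scale does not describe.

```python
def describe_repayment_status(code):
    """Translate a PAY_* repayment status code using the documented scale."""
    if code == -1:
        return "pay duly"
    if 1 <= code <= 8:
        return f"payment delay for {code} month(s)"
    if code == 9:
        return "payment delay for nine months and above"
    # Any other value is outside the scale documented in this card.
    return "not documented in the scale"


def is_valid_target(value):
    """DELINQ_NEXT is documented as 1 = late, 0 = on-time."""
    return value in (0, 1)


for status_code in (-1, 2, 9, 0):
    print(status_code, "->", describe_repayment_status(status_code))
```

Returning an explicit "not documented" label, rather than raising an error, keeps unexpected codes visible during data exploration.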
Source of training data: GWU Blackboard, email [email protected] for more information
How training data was divided into training and validation data: 50% training, 25% validation, 25% test
Number of rows in training and validation data:
Training rows: 15,000
Validation rows: 7,500
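As a rough illustration of the 50% / 25% / 25% division, the sketch below shuffles row indices and cuts them into three partitions. It assumes a total of 30,000 rows, which is implied by the row counts reported in this card (15,000 training, 7,500 validation, 7,500 test). The function name `split_indices` and the seed value are hypothetical; the card does not state how the actual split was performed.

```python
import random


def split_indices(n_rows, train_frac=0.50, valid_frac=0.25, seed=12345):
    """Shuffle row indices and cut them into train/validation/test parts."""
    indices = list(range(n_rows))
    # A fixed seed (an arbitrary, assumed value) makes the split reproducible.
    random.Random(seed).shuffle(indices)
    n_train = int(n_rows * train_frac)
    n_valid = int(n_rows * valid_frac)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # whatever remains is the test set
    return train, valid, test


train_idx, valid_idx, test_idx = split_indices(30_000)
print(len(train_idx), len(valid_idx), len(test_idx))  # 15000 7500 7500
```

Splitting on indices rather than on the rows themselves keeps the partitions disjoint by construction and lets the same split be reapplied to any table with the same row order.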
Test Data
Source of test data: GWU Blackboard, email [email protected] for more information
Number of rows in test data: 7,500
Differences in columns between training and test data: None
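A claim that the training and test data share the same columns can be checked with a simple set comparison. The sketch below is illustrative only: `column_differences` is a hypothetical helper, and the column list covers only the variables named in this section (with PAY_3 through PAY_5 and the intermediate BILL_AMT and PAY_AMT columns assumed from the stated ranges), not necessarily the full dataset.

```python
def column_differences(train_columns, test_columns):
    """Return columns found only in training data and only in test data."""
    train_set, test_set = set(train_columns), set(test_columns)
    return sorted(train_set - test_set), sorted(test_set - train_set)


# Column names taken from this section's data dictionary (partial list).
columns = (
    ["PAY_0"]
    + [f"PAY_{i}" for i in range(2, 7)]
    + [f"BILL_AMT{i}" for i in range(1, 7)]
    + [f"PAY_AMT{i}" for i in range(1, 7)]
    + ["DELINQ_NEXT"]
)

only_in_train, only_in_test = column_differences(columns, list(columns))
print(only_in_train, only_in_test)  # [] []
```

Two empty lists correspond to the "None" reported above; any name appearing in either list would indicate a column mismatch worth documenting.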