an_analysis_of_features_which_have_the_most_influence_on_loan_outcomes's Introduction

Part II - An analysis of trends across Loan Status

by Oludare Adekunle

Dataset

Loan Data from Prosper: This data set contains 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (or interest rate), current loan status, borrower income, and many others. See this data dictionary to understand the dataset's variables. For this project I chose to work with a subset of the data, using the following features: Listing Creation Date, Term, Loan Status, Borrower APR, Borrower Rate, Listing Category, Borrower State, Occupation, Employment Status, Employment Status Duration, Is Borrower Home Owner, Credit Score Range Lower, Credit Score Range Upper, Revolving Credit Balance, Bank Card Utilization, Available Bank Card Credit, Debt To Income Ratio, Income Range, Income Verifiable, Stated Monthly Income, Monthly Loan Payment, Recommendations, Investors, Loan Original Amount.
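
As an illustration of working with a subset of the features, here is a minimal sketch using only the Python standard library. The column names (`LoanStatus`, `BorrowerAPR`, and so on) and the helper `load_subset` are assumptions made for this example, and the two sample rows are made up; check the names against the header of the actual CSV file.

```python
import csv
import io

# Assumed column names, based on the feature list above; verify them
# against the header row of the real data file before use.
FEATURES = ["LoanStatus", "Term", "BorrowerAPR", "BorrowerRate", "Occupation"]


def load_subset(csv_file, columns):
    """Read an open CSV file object and keep only the requested columns."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [c for c in columns if c not in header]
    if missing:
        raise KeyError(f"columns not found in CSV header: {missing}")
    return [{c: row[c] for c in columns} for row in reader]


# Tiny made-up sample standing in for the real file.
sample = io.StringIO(
    "LoanStatus,Term,BorrowerAPR,BorrowerRate,Occupation,ListingKey\n"
    "Completed,36,0.165,0.158,Other,abc\n"
    "Current,60,0.120,0.092,Professional,def\n"
)
loans = load_subset(sample, FEATURES)
```

With the real data, the same function could be called on `open(path, newline="")`, where `path` points at the downloaded CSV file. All values come back as strings, so numeric columns need converting before analysis.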

The aim of this project is to investigate how various factors influence the loan outcomes listed in the loan status column.

Summary of Findings

From carrying out univariate, bivariate and multivariate analysis of the data, I found the following:

  1. Of the observations in the data, 50% are current loans, 33% are completed, and 17% are in some state of defaulted, charged off or past due.
  2. Occupation is missing for 3,588 observations (3.15%).
  3. Non-specific occupation types (Other and Professional) are the most common occupations among borrowers.
  4. The borrower annual percentage rate (APR) has more unique values than the borrower rate, as it shows an all-year-round moving average.
  5. Both the borrower APR and the borrower rate appear to be normally distributed.
  6. The upper and lower credit score ranges differ only in their start and end values: the upper range is offset from the lower range by 19.
  7. There appear to be outliers in both the upper and lower credit score ranges.
  8. Observations with a lower credit score greater than 600 and an upper credit score less than 820 represent the bulk of the observations.
  9. There are 133 outliers in the credit score range: observations with abnormally low credit scores that were nonetheless offered loans.
    Of these, approximately 60 have defaulted, 35 have been charged off and 38 have been completed.
  10. In almost every loan status category, the most common reason for borrowing is debt consolidation.
    However, in the defaulted category, where we would expect debt consolidation to top the list as well, loans with no recorded purpose top the chart.
  11. Loans with no recorded purpose also take the lead among loans that have been charged off.
  12. Observations in the completed loans category are clustered more around the lower boundary of the borrower rate and have a median borrower rate of less than 0.2.
    This trend is similar for the current loans category.
  13. Observations in the defaulted, cancelled and past due categories of the loan status feature are clustered more towards the upper boundary of the borrower rate and generally have a median borrower rate greater than 0.2.
  14. The credit score does not show any clear relationship between the
  15. Most current loans are held by employed borrowers.
  16. The full-time category shows the best performance on loan completion but also the worst, as it has the highest number of defaulted and charged-off loans.
  17. The not-displayed and 0 categories of the income range have an unusually high number of charged-off or defaulted loans.
  18. Current loans have a higher median amount than any other category.
  19. The average monthly income of those who completed their payments does not vary much from that of those who defaulted or had their loans charged off.
  20. Generally, home-owners have a lower borrower rate.
  21. The distribution of observations across the different statuses is the same for home-owners and non-home-owners, except in the defaulted status, where home-owners who defaulted are concentrated at high borrower rates, while the reverse is the case for non-home-owners.
  22. Regardless of home-owner status, a low borrower rate tends to play a role in enabling loan completion.
  23. From the plot, we can see that the relationship is a little less negative for home-owners than for non-home-owners.
  24. Across the different loan statuses, shorter-term loans have lower interest rates.
  25. Across all loan terms, completed loans are clustered around lower rates, while past due and defaulted loans are clustered around higher interest rates.
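
Several of the findings above (the share of loans in each status, the share of missing occupation values, and the median borrower rate per status) come down to simple grouped summaries. The following is a minimal sketch of how such summaries could be computed with the Python standard library; it is not the code used for the analysis. The function names, the column names `LoanStatus`, `BorrowerRate` and `Occupation`, and the toy rows are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from statistics import median


def status_shares(rows):
    """Fraction of observations in each loan status."""
    counts = Counter(r["LoanStatus"] for r in rows)
    total = sum(counts.values())
    return {status: n / total for status, n in counts.items()}


def missing_share(rows, column):
    """Fraction of observations where `column` is empty or absent."""
    missing = sum(1 for r in rows if not r.get(column))
    return missing / len(rows)


def median_rate_by_status(rows):
    """Median borrower rate within each loan status."""
    rates = defaultdict(list)
    for r in rows:
        rates[r["LoanStatus"]].append(float(r["BorrowerRate"]))
    return {status: median(values) for status, values in rates.items()}


# Made-up rows for demonstration only; they are not taken from the dataset.
toy = [
    {"LoanStatus": "Completed", "BorrowerRate": "0.10", "Occupation": "Other"},
    {"LoanStatus": "Completed", "BorrowerRate": "0.14", "Occupation": ""},
    {"LoanStatus": "Defaulted", "BorrowerRate": "0.25", "Occupation": "Professional"},
    {"LoanStatus": "Current", "BorrowerRate": "0.12", "Occupation": "Other"},
]

shares = status_shares(toy)
occupation_missing = missing_share(toy, "Occupation")
medians = median_rate_by_status(toy)
```

Comparing the per-status medians against a threshold such as 0.2 is one way to check the pattern described in findings 12 and 13.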

Key Insights for Presentation

The goal of my exploration was to determine what factors influence the outcome of an offered loan, in order to improve collections on loans offered. The key insights for my presentation are listed below:

  1. The current distribution of all offered loans across loan status categories
  2. The effect borrower rates have on loan status outcomes
  3. The effect the reason for which a loan is borrowed has on loan status outcomes
  4. The effect the employment status of an individual has on loan status outcomes
  5. The relationship between loan terms and borrower rate on loan status outcomes
