gcflymoto / lendingclub
Lending Club Data Analysis and Algorithms
Running with the following parameters
lcbt.py -f 10 -b 20 -w 1
I get results like this:
Matched 14/466287 loans (0/mo.) test at 23.27% APY, 4 loans defaulted (28.00%, $0.00 avg loss) 25.4963% net APY.
The avg loss does not seem to be calculated correctly (it shows $0.00 even though 4 loans defaulted), so the net_apy is off. I have been digging through the code, but I have not yet found where this is calculated. Any pointers would be helpful.
It seems that is_inc_v is no longer present in the CSV files, so the script keeps failing at startup.
Looking at the NAR calculations, I assume they are meant to follow the NAR formula provided by Lending Club. However, that formula totals values summed over each monthly installment, so for each month of the loan the principal diminishes and is returned to the investor. The formula used in LoanData doesn't appear to have any concept of time. It instead seems to assume that all profits and losses occur in a single month, which should give very skewed results for NAR.
Does that analysis seem correct to you?
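To make the time dependence concrete, here is a minimal sketch of a standard fixed-rate amortization schedule. It is not the LoanData code and not Lending Club's exact NAR formula; the function name `amortization_schedule` is hypothetical, and the inputs (5000 funded, 10.65% rate, 36 months) are borrowed from the sample loan row quoted further down this page.

```python
def amortization_schedule(principal, annual_rate, months):
    """Return (payment, rows) for a fixed-rate, fully amortizing loan.

    Each row is (month, interest_paid, principal_paid, remaining_balance).
    """
    monthly_rate = annual_rate / 12.0
    # Standard annuity payment formula.
    payment = principal * monthly_rate / (1.0 - (1.0 + monthly_rate) ** -months)
    rows = []
    balance = principal
    for month in range(1, months + 1):
        # Interest accrues only on the principal still outstanding.
        interest = balance * monthly_rate
        principal_paid = payment - interest
        balance -= principal_paid
        rows.append((month, interest, principal_paid, balance))
    return payment, rows


payment, rows = amortization_schedule(5000.0, 0.1065, 36)
total_interest = sum(row[1] for row in rows)
print("monthly payment: %.2f" % payment)
print("interest in first month: %.2f" % rows[0][1])
print("interest in last month:  %.2f" % rows[-1][1])
print("total interest: %.2f" % total_interest)
```

The computed payment comes out close to the 162.87 installment in that row, and the interest earned shrinks every month as principal is repaid. A formula that books all profit and loss in a single period cannot capture that.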
Looks like some kind of parse/initialization error.
Has formatting changed?
Downloading: LoanStats3c.csv.zip
Worker[-1] Initializing from LoanStats3a.csv inside LoanStats3a.csv.zip ...
Error in row 3
mths_since_last_delinq:
inq_last_6mths: 1
grade: B
annual_inc: 24000
total_acc: 9
out_prncp: 0.00
emp_length: 10+ years
total_pymnt: 5861.071414249
out_prncp_inv: 0.00
pub_rec: 0
revol_util: 83.7%
total_rec_prncp: 5000.00
earliest_cr_line: Jan-1985
delinq_2yrs: 0
open_acc: 3
dti: 27.65
purpose: credit_card
addr_state: AZ
desc: Borrower added on 12/22/11 > I need to upgrade my business technologies.
term: 36 months
total_rec_int: 861.07
installment: 162.87
int_rate: 10.65%
funded_amnt: 5000
loan_status: Fully Paid
home_ownership: RENT
issue_d: Dec-2011
Traceback (most recent call last):
  File "lcbt.py", line 834, in <module>
    sys.exit(main())
  File "lcbt.py", line 790, in main
    lcbt.initialize()
  File "lcbt.py", line 534, in initialize
    self.test.get_loan_data().initialize()
  File "C:\Users\Dean\Dropbox\Genetic Algorithm\LendingClub-master\LendingClub-master\py\LoanData.py", line 67, in initialize
    self.load_data()
  File "C:\Users\Dean\Dropbox\Genetic Algorithm\LendingClub-master\LendingClub-master\py\SqliteLoanData.py", line 24, in load_data
    LoanData.LCLoanData.load_data(self)
  File "C:\Users\Dean\Dropbox\Genetic Algorithm\LendingClub-master\LendingClub-master\py\LoanData.py", line 107, in load_data
    self.parse_lc_csv(csv_reader, loans, loans_info)
  File "C:\Users\Dean\Dropbox\Genetic Algorithm\LendingClub-master\LendingClub-master\py\LoanData.py", line 134, in parse_lc_csv
    loan, loan_info, parsed_loan_ok = self.normalize_loan_data(raw_loan)
  File "C:\Users\Dean\Dropbox\Genetic Algorithm\LendingClub-master\LendingClub-master\py\LoanData.py", line 214, in normalize_loan_data
    raw_loan[conversion_filters[LOAN_ENUM_income_validated].name])
KeyError: 'is_inc_v'
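The KeyError above is raised because the row dictionary is indexed with a column name that the CSV no longer contains. One possible workaround, sketched below, is to look a value up under several candidate column names and fall back to a default. The helper `first_present` is hypothetical and not part of LoanData.py, and `verification_status` is only an assumed replacement column name that should be checked against the actual CSV header.

```python
import csv
import io


def first_present(row, candidates, default=None):
    """Return the value of the first candidate column present in row."""
    for name in candidates:
        if name in row:
            return row[name]
    return default


# ASSUMPTION: "verification_status" is a guess at the renamed column;
# confirm it against the header of the downloaded file.
INCOME_VALIDATED_COLUMNS = ("is_inc_v", "verification_status")

# Tiny in-memory CSV standing in for a real LoanStats file.
sample = io.StringIO("grade,verification_status\nB,Verified\n")
for raw_loan in csv.DictReader(sample):
    value = first_present(raw_loan, INCOME_VALIDATED_COLUMNS, default="")
    print(raw_loan["grade"], value)
```

Returning a default instead of raising lets the parser skip or flag a loan with a missing field rather than aborting the whole load.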
LoanStatsNew.csv is no longer at https://www.lendingclub.com/fileDownload.action?file=LoanStatsNew.csv&type=gen
Instead it is hosted as a zip file at https://resources.lendingclub.com/LoanStats3c.csv.zip
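Since the data now ships as a zip archive, the CSV has to be read from inside it. The following is a rough sketch using only the standard library. The function names `download_bytes` and `iter_csv_rows_from_zip` are hypothetical, the UTF-8 encoding is an assumption, and the member name is assumed to be the archive name without `.zip`, as the "LoanStats3a.csv inside LoanStats3a.csv.zip" log line suggests.

```python
import csv
import io
import urllib.request
import zipfile


def download_bytes(url):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def iter_csv_rows_from_zip(zip_bytes, member_name):
    """Yield dict rows from one CSV member of an in-memory zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member_name) as raw:
            # ASSUMPTION: the CSV is UTF-8 encoded.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            for row in csv.DictReader(text):
                yield row


# Offline demonstration with a tiny archive built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("LoanStats3c.csv", "grade,issue_d\nB,Sep-2014\n")

rows = list(iter_csv_rows_from_zip(buffer.getvalue(), "LoanStats3c.csv"))
print(rows)

# Against the real file (requires network access):
# data = download_bytes("https://resources.lendingclub.com/LoanStats3c.csv.zip")
# for row in iter_csv_rows_from_zip(data, "LoanStats3c.csv"):
#     ...
```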
It also appears that the issue date format has changed, and as a result loans cannot be parsed. The new issue_d column contains mmm-YYYY, for example Sep-2014 or Aug-2013.
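A minimal sketch of parsing the mmm-YYYY format follows. The function name `parse_issue_date` is hypothetical and not taken from the repository. It uses an explicit month table instead of `strptime("%b-%Y")` so the result does not depend on the process locale, and it maps every date to the first day of the month, which is an arbitrary choice.

```python
import datetime

MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def parse_issue_date(text):
    """Parse 'Sep-2014' into a date on the 1st of that month.

    Returns None when the text does not match the mmm-YYYY format.
    """
    try:
        month_abbr, year = text.strip().split("-")
        return datetime.date(int(year), MONTHS[month_abbr], 1)
    except (ValueError, KeyError, AttributeError):
        return None


for sample in ("Sep-2014", "Aug-2013", "not a date"):
    print(sample, "->", parse_issue_date(sample))
```

The same helper would also handle fields such as earliest_cr_line, which uses the same format in the row shown above (Jan-1985).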