damonlee92 / build_your_own_ai_investor_2021

AI Investor

Home Page: https://www.valueinvestingai.com

License: GNU General Public License v3.0

Jupyter Notebook 100.00%

build_your_own_ai_investor_2021's Issues

Error in 4_Backtesting_AltmanZ, the 0 sample(s) problem

Hi, I am trying to follow the tutorial and have run into an issue.

When I run the function below, an error occurs.
It looks like sample data is missing, but I checked and the 2019 data should exist.

code:
backTest = getPortTimeSeries(y_withData_Test, X_test,
daily_stock_prices_data,
trained_model_pipeline)

output:
Backtest performance for year starting 2016-12-31 00:00:00 is: 33.03 %
With stocks: ['UPS' 'LII' 'LNTH' 'BA' 'WK' 'INVA' 'NVAX']
UPS Performance was: 11.97 %
LII Performance was: 26.56 %
LNTH Performance was: 67.59 %
BA Performance was: 75.78 %
WK Performance was: 48.8 %
INVA Performance was: 21.69 %
NVAX Performance was: -21.19 %

Backtest performance for year starting 2017-12-31 00:00:00 is: 88.63 %
With stocks: ['LII' 'VSM' 'TLND' 'EGAN' 'TNDM' 'W' 'WNDW']
LII Performance was: 0.88 %
VSM Performance was: -16.76 %
TLND Performance was: -31.48 %
EGAN Performance was: 1.87 %
TNDM Performance was: 725.45 %
W Performance was: 5.47 %
WNDW Performance was: -65.0 %

Backtest performance for year starting 2018-12-31 00:00:00 is: -25.68 %
With stocks: ['ENDP' 'BA' 'DBD' 'CHRS' 'PHUN' 'MNKD' 'WK']
ENDP Performance was: -61.57 %
BA Performance was: -19.2 %
DBD Performance was: 18.67 %
CHRS Performance was: 22.14 %
PHUN Performance was: -97.69 %
MNKD Performance was: -29.95 %
WK Performance was: -12.19 %

ValueError Traceback (most recent call last)
in
----> 1 backTest = getPortTimeSeries(y_withData_Test, X_test,
2 daily_stock_prices_data,
3 trained_model_pipeline)
4
5 print('Performance is: ',

in getPortTimeSeries(y_withData, X, daily_stock_prices, ml_model_pipeline, verbose)
19
20 [comp, this_year_perf, ticker_list] =
---> 21 getPortTimeSeriesForYear(curr_date, y_withData, X,
22 daily_stock_prices, ml_model_pipeline)
23

in getPortTimeSeriesForYear(date_starting, y_withData, X, daily_stock_prices, ml_model_pipeline)
26
27 # Get return prediction from model
---> 28 y_pred = ml_model_pipeline.predict(X[thisYearMask])
29
30 # Make it a DataFrame to select the top picks

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\metaestimators.py in (*args, **kwargs)
118
119 # lambda, but not partial, allows help() to work with update_wrapper
--> 120 out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
121 # update the docstring of the returned function
122 update_wrapper(out, self.fn)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\pipeline.py in predict(self, X, **predict_params)
416 Xt = X
417 for _, name, transform in self._iter(with_final=False):
--> 418 Xt = transform.transform(Xt)
419 return self.steps[-1][-1].predict(Xt, **predict_params)
420

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_data.py in transform(self, X)
3094 """
3095 check_is_fitted(self)
-> 3096 X = self._check_input(X, in_fit=False, check_positive=True,
3097 check_shape=True)
3098

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_data.py in _check_input(self, X, in_fit, check_positive, check_shape, check_method)
3268 If True, check that the transformation method is valid.
3269 """
-> 3270 X = self._validate_data(X, ensure_2d=True, dtype=FLOAT_DTYPES,
3271 copy=self.copy, force_all_finite='allow-nan',
3272 reset=in_fit)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
419 out = X
420 elif isinstance(y, str) and y == 'no_validation':
--> 421 X = check_array(X, **check_params)
422 out = X
423 else:

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
667 n_samples = _num_samples(array)
668 if n_samples < ensure_min_samples:
--> 669 raise ValueError("Found array with %d sample(s) (shape=%s) while a"
670 " minimum of %d is required%s."
671 % (n_samples, array.shape, ensure_min_samples,

ValueError: Found array with 0 sample(s) (shape=(0, 18)) while a minimum of 1 is required.
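
The error is raised because X[thisYearMask] selects no rows for the fourth backtest year, so the first transformer in the pipeline receives an array of shape (0, 18). The traceback does not show whether those rows are absent from X_test or whether the mask's date window simply fails to match them. A minimal sketch of a guard, assuming a hypothetical helper `predict_for_mask` and a stand-in model (neither is from the book):

```python
import pandas as pd


class MeanModel:
    """Stand-in for the trained pipeline; only predict() is needed here."""

    def predict(self, X):
        return X.mean(axis=1).to_numpy()


def predict_for_mask(model, X, mask):
    """Return predictions for X[mask], or None when the mask selects no rows."""
    X_year = X[mask]
    if X_year.empty:
        # Nothing to predict for this period: let the caller skip it
        # instead of passing a (0, n_features) array to scikit-learn.
        return None
    return model.predict(X_year)


X = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [3.0, 4.0, 5.0]})
has_rows = pd.Series([True, False, True])
no_rows = pd.Series([False, False, False])

preds = predict_for_mask(MeanModel(), X, has_rows)
skipped = predict_for_mask(MeanModel(), X, no_rows)
```

Printing thisYearMask.sum() for each curr_date before the call to predict should show which year comes up empty.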

Error in Chapter 4 to 6/ 2 Process X_Y Learning Data in the last paragraph

Hi there,

I am really enjoying your book. Unfortunately, there is a mistake in the last paragraph, on line 22: X[Net Income] = X[Net Income_x].

There is also one at the start of the section (filtering raw data), on line 51: either X has to be x, or the x in def getYPricesReportDateAndTargetDate(x, d, modifier=365) has to be X.

Added filter to filter out rows with negative revenue

I added a filter to remove rows with negative revenue, which improved results.

X[(X['Revenue']<0)]# rows with revenue issues

Issue where revenue is negative

bool_list5 = ~((X['Revenue']<0))

y=y[bool_list4 & bool_list5]
X=X[bool_list4 & bool_list5]
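
As a self-contained sketch of the filter above: the toy data and the contents of bool_list4 are made up for illustration (in the notebook, bool_list4 comes from an earlier filtering step that is not shown in this issue).

```python
import pandas as pd

X = pd.DataFrame({"Revenue": [100.0, -5.0, 0.0, 250.0]})
y = pd.Series([0.10, 0.50, -0.20, 0.05])

# Placeholder for the earlier filter; here it keeps every row.
bool_list4 = pd.Series(True, index=X.index)

# Keep only rows whose revenue is not negative (zero is kept).
bool_list5 = ~(X["Revenue"] < 0)

keep = bool_list4 & bool_list5
X = X[keep]
y = y[keep]
```

Rows with a NaN revenue also pass this test, because NaN < 0 evaluates to False.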

In Chapter 2 - Exercise 3, the description of the Exercise does not match the solution

The description of Exercise 3 is:
# Output stocks with a price/earnings ratio below a number and a beta above a number.
# return a list of aceptable stocks with ascending by P/E

The function 'filterStocks' actually differs from the description in two ways:

It tests for beta below a number, not above;
It doesn't sort the acceptable stocks ascending by P/E
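
For comparison, a function that does what the description asks might look like the sketch below. The name `filter_stocks`, its parameters, and the dictionary layout are assumptions for illustration, not the notebook's actual code.

```python
def filter_stocks(stocks, max_pe, min_beta):
    """Return tickers with P/E below max_pe and beta above min_beta,
    sorted by ascending P/E."""
    accepted = [
        (info["pe"], ticker)
        for ticker, info in stocks.items()
        if info["pe"] < max_pe and info["beta"] > min_beta
    ]
    # Tuples sort by their first element, so this orders by P/E.
    return [ticker for _, ticker in sorted(accepted)]


stocks = {
    "AAA": {"pe": 12.0, "beta": 1.3},
    "BBB": {"pe": 8.0, "beta": 1.1},
    "CCC": {"pe": 6.0, "beta": 0.7},   # beta too low
    "DDD": {"pe": 30.0, "beta": 1.5},  # P/E too high
}
result = filter_stocks(stocks, max_pe=15, min_beta=1.0)
```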

Error in Chapter 4 to 6/ 2 Process X_Y Learning Data fixNansInX function

For some reason, when I run this part of the code as given in the book, I get the following error. Any idea why it no longer works, or how to fix it?

def fixNansInX(x):
    '''
    Takes in x DataFrame, edits it so that important keys
    are 0 instead of NaN.
    '''
    keyCheckNullList = ["Short Term Debt" ,\
            "Long Term Debt" ,\
            "Interest Expense, Net",\
            "Income Tax (Expense) Benefit, Net",\
            "Cash, Cash Equivalents & Short Term Investments",\
            "Property, Plant & Equipment, Net",\
            "Revenue",\
            "Gross Profit",\
            "Total Current Liabilities",\
            "Property, Plant & Equipment, Net"]
    x[keyCheckNullList]=x[keyCheckNullList].fillna(0)

addColsToX(x_)
fixNansInX(x_)
X=getXRatios(x_)
fixXRatios(X)

y=getYPerf(y_)

______________________________________________________________________________________

ValueError                                Traceback (most recent call last)
Input In [134], in <cell line: 3>()
      1 # From x_ (raw fundamental data) get X (stock fundamental ratios)
      2 addColsToX(x_)
----> 3 fixNansInX(x_)
      4 X=getXRatios(x_)
      5 fixXRatios(X)

Input In [133], in fixNansInX(x)
      2 '''
      3 Takes in x DataFrame, edits it so that important keys
      4 are 0 instead of NaN.
      5 '''
      6 keyCheckNullList = ["Short Term Debt" ,\
      7         "Long Term Debt" ,\
      8         "Interest Expense, Net",\
   (...)
     14         "Total Current Liabilities",\
     15         "Property, Plant & Equipment, Net"]
---> 16 x[keyCheckNullList]=x[keyCheckNullList].fillna(0)

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3643, in DataFrame.__setitem__(self, key, value)
   3641     self._setitem_frame(key, value)
   3642 elif isinstance(key, (Series, np.ndarray, list, Index)):
-> 3643     self._setitem_array(key, value)
   3644 elif isinstance(value, DataFrame):
   3645     self._set_item_frame_value(key, value)

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3687, in DataFrame._setitem_array(self, key, value)
   3685     check_key_length(self.columns, key, value)
   3686     for k1, k2 in zip(key, value.columns):
-> 3687         self[k1] = value[k2]
   3689 elif not is_list_like(value):
   3690     for col in key:

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3645, in DataFrame.__setitem__(self, key, value)
   3643     self._setitem_array(key, value)
   3644 elif isinstance(value, DataFrame):
-> 3645     self._set_item_frame_value(key, value)
   3646 elif (
   3647     is_list_like(value)
   3648     and not self.columns.is_unique
   3649     and 1 < len(self.columns.get_indexer_for([key])) == len(value)
   3650 ):
   3651     # Column to set is duplicated
   3652     self._setitem_array([key], value)

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3775, in DataFrame._set_item_frame_value(self, key, value)
   3773 len_cols = 1 if is_scalar(cols) else len(cols)
   3774 if len_cols != len(value.columns):
-> 3775     raise ValueError("Columns must be same length as key")
   3777 # align right-hand-side columns if self.columns
   3778 # is multi-index and self[key] is a sub-frame
   3779 if isinstance(self.columns, MultiIndex) and isinstance(
   3780     loc, (slice, Series, np.ndarray, Index)
   3781 ):

ValueError: Columns must be same length as key
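
One likely cause, judging from the listing above: "Property, Plant & Equipment, Net" appears twice in keyCheckNullList. With a repeated label, x[keyCheckNullList] contains that column twice, so value[k2] in the traceback is a two-column DataFrame being assigned to a single key, which the pandas version shown rejects. A minimal sketch of a workaround that drops the repeat before filling (toy data and a shortened key list, both made up for illustration):

```python
import numpy as np
import pandas as pd


def fixNansInX(x):
    """Replace NaN with 0 in the key columns of x, in place."""
    keyCheckNullList = [
        "Short Term Debt",
        "Revenue",
        "Property, Plant & Equipment, Net",
        "Property, Plant & Equipment, Net",  # repeated, as in the issue
    ]
    # dict.fromkeys drops repeats while keeping the original order.
    unique_keys = list(dict.fromkeys(keyCheckNullList))
    x[unique_keys] = x[unique_keys].fillna(0)


x_ = pd.DataFrame({
    "Short Term Debt": [1.0, np.nan],
    "Revenue": [np.nan, 5.0],
    "Property, Plant & Equipment, Net": [np.nan, np.nan],
    "Ticker": ["AAA", "BBB"],
})
fixNansInX(x_)
```

Simply deleting the second "Property, Plant & Equipment, Net" entry from the list would have the same effect.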
