
build_your_own_ai_investor_2021's People

Contributors

damonlee92


build_your_own_ai_investor_2021's Issues

Error in 4_Backtesting_AltmanZ, the 0 sample(s) problem

Hi, I am following the tutorial and ran into an issue.

When I run the function below, an error occurs.
It looks as if sample data is missing, but I checked and the 2019 data should exist.

code:
backTest = getPortTimeSeries(y_withData_Test, X_test,
daily_stock_prices_data,
trained_model_pipeline)

output:
Backtest performance for year starting 2016-12-31 00:00:00 is: 33.03 %
With stocks: ['UPS' 'LII' 'LNTH' 'BA' 'WK' 'INVA' 'NVAX']
UPS Performance was: 11.97 %
LII Performance was: 26.56 %
LNTH Performance was: 67.59 %
BA Performance was: 75.78 %
WK Performance was: 48.8 %
INVA Performance was: 21.69 %
NVAX Performance was: -21.19 %

Backtest performance for year starting 2017-12-31 00:00:00 is: 88.63 %
With stocks: ['LII' 'VSM' 'TLND' 'EGAN' 'TNDM' 'W' 'WNDW']
LII Performance was: 0.88 %
VSM Performance was: -16.76 %
TLND Performance was: -31.48 %
EGAN Performance was: 1.87 %
TNDM Performance was: 725.45 %
W Performance was: 5.47 %
WNDW Performance was: -65.0 %

Backtest performance for year starting 2018-12-31 00:00:00 is: -25.68 %
With stocks: ['ENDP' 'BA' 'DBD' 'CHRS' 'PHUN' 'MNKD' 'WK']
ENDP Performance was: -61.57 %
BA Performance was: -19.2 %
DBD Performance was: 18.67 %
CHRS Performance was: 22.14 %
PHUN Performance was: -97.69 %
MNKD Performance was: -29.95 %
WK Performance was: -12.19 %


ValueError Traceback (most recent call last)
in
----> 1 backTest = getPortTimeSeries(y_withData_Test, X_test,
2 daily_stock_prices_data,
3 trained_model_pipeline)
4
5 print('Performance is: ',

in getPortTimeSeries(y_withData, X, daily_stock_prices, ml_model_pipeline, verbose)
19
20 [comp, this_year_perf, ticker_list] =
---> 21 getPortTimeSeriesForYear(curr_date, y_withData, X,
22 daily_stock_prices, ml_model_pipeline)
23

in getPortTimeSeriesForYear(date_starting, y_withData, X, daily_stock_prices, ml_model_pipeline)
26
27 # Get return prediction from model
---> 28 y_pred = ml_model_pipeline.predict(X[thisYearMask])
29
30 # Make it a DataFrame to select the top picks

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\metaestimators.py in (*args, **kwargs)
118
119 # lambda, but not partial, allows help() to work with update_wrapper
--> 120 out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
121 # update the docstring of the returned function
122 update_wrapper(out, self.fn)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\pipeline.py in predict(self, X, **predict_params)
416 Xt = X
417 for _, name, transform in self._iter(with_final=False):
--> 418 Xt = transform.transform(Xt)
419 return self.steps[-1][-1].predict(Xt, **predict_params)
420

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_data.py in transform(self, X)
3094 """
3095 check_is_fitted(self)
-> 3096 X = self._check_input(X, in_fit=False, check_positive=True,
3097 check_shape=True)
3098

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_data.py in _check_input(self, X, in_fit, check_positive, check_shape, check_method)
3268 If True, check that the transformation method is valid.
3269 """
-> 3270 X = self._validate_data(X, ensure_2d=True, dtype=FLOAT_DTYPES,
3271 copy=self.copy, force_all_finite='allow-nan',
3272 reset=in_fit)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
419 out = X
420 elif isinstance(y, str) and y == 'no_validation':
--> 421 X = check_array(X, **check_params)
422 out = X
423 else:

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
667 n_samples = _num_samples(array)
668 if n_samples < ensure_min_samples:
--> 669 raise ValueError("Found array with %d sample(s) (shape=%s) while a"
670 " minimum of %d is required%s."
671 % (n_samples, array.shape, ensure_min_samples,

ValueError: Found array with 0 sample(s) (shape=(0, 18)) while a minimum of 1 is required.
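
The shape (0, 18) in the final message means that X[thisYearMask] selected no rows for the year following the 2018-12-31 backtest, so the pipeline was asked to predict on an empty frame. How thisYearMask is built is not shown in this report, so the following is only a minimal sketch of a guard, assuming the mask is a boolean Series aligned with X. The names predict_if_any_rows and MeanModel are hypothetical and are not part of the book's code.

```python
import numpy as np
import pandas as pd


def predict_if_any_rows(model, X, mask):
    """Return predictions for X[mask], or an empty array if mask selects nothing.

    Hypothetical helper, not part of the book's code.
    """
    selected = X[mask]
    if selected.empty:
        # scikit-learn raises "Found array with 0 sample(s)" on empty input,
        # so skip the call and let the caller decide how to handle that year.
        return np.array([])
    return model.predict(selected)


class MeanModel:
    """Stand-in for a fitted pipeline; predicts the mean of each row."""

    def predict(self, X):
        return X.mean(axis=1).to_numpy()


X = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
no_rows = pd.Series([False, False], index=X.index)
all_rows = pd.Series([True, True], index=X.index)

empty_pred = predict_if_any_rows(MeanModel(), X, no_rows)
full_pred = predict_if_any_rows(MeanModel(), X, all_rows)
print(len(empty_pred), full_pred)
```

Printing thisYearMask.sum() for each start date inside getPortTimeSeriesForYear would show which year has no matching rows in the test set.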

Error in Chapter 4 to 6/ 2 Process X_Y Learning Data in the last paragraph

Hi there,

I am really enjoying your book. Unfortunately, there is a mistake in the last paragraph, in the line X[Net Income] = X[Net Income_x] (line 22).

There is also one in the first part of the section (filtering raw data), in line 51: either X has to be x, or the x in def getYPricesReportDateAndTargetDate(x, d, modifier=365) has to be X.

Added filter to filter out rows with negative revenue

I added a filter that removes rows with negative revenue, which improved the results.

X[(X['Revenue'] < 0)]  # rows with revenue issues

# Issue where revenue is negative
bool_list5 = ~(X['Revenue'] < 0)

y = y[bool_list4 & bool_list5]
X = X[bool_list4 & bool_list5]
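
To show what the mask does in isolation, here is a small sketch on made-up data. The column name Perf and all values are assumptions chosen for illustration, and bool_list4 from the notebook is left out because its definition is not shown here.

```python
import pandas as pd

# Toy stand-ins for the notebook's X (features) and y (targets).
X = pd.DataFrame({"Revenue": [100.0, -5.0, 0.0, 250.0]})
y = pd.DataFrame({"Perf": [0.1, 0.2, 0.3, 0.4]})

# True for rows to keep: revenue is not negative.
# A NaN revenue compares as False against 0, so NaN rows are kept as well.
keep_revenue = ~(X["Revenue"] < 0)

# Apply the same mask to both frames so they stay row-aligned.
X_filtered = X[keep_revenue]
y_filtered = y[keep_revenue]
print(X_filtered.index.tolist())
```

Applying the same mask to X and y matters because the two frames are matched row by row later on.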

In Chapter 2 - Exercise 3, the description of the Exercise does not match the solution

The description of Exercise 3 is:
# Output stocks with a price/earnings ratio below a number and a beta above a number.
# return a list of aceptable stocks with ascending by P/E

The function 'filterStocks' actually differs from the description in two ways:

  • It tests for beta below a number, not above;
  • It doesn't sort the acceptable stocks ascending by P/E
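
For comparison, a version that follows the written description could look like the sketch below. The book's actual data structure for Exercise 3 is not reproduced in this issue, so the dictionary layout, the keys pe and beta, and the name filter_stocks_as_described are all assumptions.

```python
def filter_stocks_as_described(stocks, max_pe, min_beta):
    """Return tickers with P/E below max_pe and beta above min_beta,
    sorted ascending by P/E.

    Hypothetical rewrite; `stocks` maps ticker -> {"pe": float, "beta": float}.
    """
    acceptable = [
        (ticker, info["pe"])
        for ticker, info in stocks.items()
        if info["pe"] < max_pe and info["beta"] > min_beta
    ]
    # Sort on the P/E value stored in each (ticker, pe) pair.
    acceptable.sort(key=lambda pair: pair[1])
    return [ticker for ticker, _ in acceptable]


# Made-up tickers and numbers, for illustration only.
stocks = {
    "AAA": {"pe": 12.0, "beta": 1.4},
    "BBB": {"pe": 8.0, "beta": 1.1},
    "CCC": {"pe": 30.0, "beta": 1.8},
    "DDD": {"pe": 9.0, "beta": 0.7},
}
picked = filter_stocks_as_described(stocks, max_pe=15.0, min_beta=1.0)
print(picked)
```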

Error in Chapter 4 to 6/ 2 Process X_Y Learning Data fixNansInX function

For some reason, when I run this part of the code as stated in the book, I get the following error. Any idea why it no longer works, or how to fix it?

def fixNansInX(x):
    '''
    Takes in x DataFrame, edits it so that important keys
    are 0 instead of NaN.
    '''
    keyCheckNullList = ["Short Term Debt" ,\
            "Long Term Debt" ,\
            "Interest Expense, Net",\
            "Income Tax (Expense) Benefit, Net",\
            "Cash, Cash Equivalents & Short Term Investments",\
            "Property, Plant & Equipment, Net",\
            "Revenue",\
            "Gross Profit",\
            "Total Current Liabilities",\
            "Property, Plant & Equipment, Net"]
    x[keyCheckNullList]=x[keyCheckNullList].fillna(0)
addColsToX(x_)
fixNansInX(x_)
X=getXRatios(x_)
fixXRatios(X)

y=getYPerf(y_)

______________________________________________________________________________________

ValueError                                Traceback (most recent call last)
Input In [134], in <cell line: 3>()
      1 # From x_ (raw fundamental data) get X (stock fundamental ratios)
      2 addColsToX(x_)
----> 3 fixNansInX(x_)
      4 X=getXRatios(x_)
      5 fixXRatios(X)

Input In [133], in fixNansInX(x)
      2 '''
      3 Takes in x DataFrame, edits it so that important keys
      4 are 0 instead of NaN.
      5 '''
      6 keyCheckNullList = ["Short Term Debt" ,\
      7         "Long Term Debt" ,\
      8         "Interest Expense, Net",\
   (...)
     14         "Total Current Liabilities",\
     15         "Property, Plant & Equipment, Net"]
---> 16 x[keyCheckNullList]=x[keyCheckNullList].fillna(0)

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3643, in DataFrame.__setitem__(self, key, value)
   3641     self._setitem_frame(key, value)
   3642 elif isinstance(key, (Series, np.ndarray, list, Index)):
-> 3643     self._setitem_array(key, value)
   3644 elif isinstance(value, DataFrame):
   3645     self._set_item_frame_value(key, value)

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3687, in DataFrame._setitem_array(self, key, value)
   3685     check_key_length(self.columns, key, value)
   3686     for k1, k2 in zip(key, value.columns):
-> 3687         self[k1] = value[k2]
   3689 elif not is_list_like(value):
   3690     for col in key:

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3645, in DataFrame.__setitem__(self, key, value)
   3643     self._setitem_array(key, value)
   3644 elif isinstance(value, DataFrame):
-> 3645     self._set_item_frame_value(key, value)
   3646 elif (
   3647     is_list_like(value)
   3648     and not self.columns.is_unique
   3649     and 1 < len(self.columns.get_indexer_for([key])) == len(value)
   3650 ):
   3651     # Column to set is duplicated
   3652     self._setitem_array([key], value)

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3775, in DataFrame._set_item_frame_value(self, key, value)
   3773 len_cols = 1 if is_scalar(cols) else len(cols)
   3774 if len_cols != len(value.columns):
-> 3775     raise ValueError("Columns must be same length as key")
   3777 # align right-hand-side columns if self.columns
   3778 # is multi-index and self[key] is a sub-frame
   3779 if isinstance(self.columns, MultiIndex) and isinstance(
   3780     loc, (slice, Series, np.ndarray, Index)
   3781 ):

ValueError: Columns must be same length as key
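
One possible cause, judging from the traceback: "Property, Plant & Equipment, Net" appears twice in keyCheckNullList, so x[keyCheckNullList] has two columns with that name, and value[k2] in _setitem_array then returns a two-column frame for a single key. This is an inference rather than a confirmed diagnosis, since duplicate column names in x_ itself can produce the same message. A minimal sketch of the workaround, using a made-up frame and a shortened key list:

```python
import numpy as np
import pandas as pd

# Shortened version of the key list, with the same repeated entry.
key_check_null_list = [
    "Revenue",
    "Property, Plant & Equipment, Net",
    "Gross Profit",
    "Property, Plant & Equipment, Net",  # duplicate, as in the reported list
]

# dict.fromkeys drops repeats while keeping first-seen order.
unique_keys = list(dict.fromkeys(key_check_null_list))

# Made-up data standing in for the raw fundamentals frame.
x = pd.DataFrame({
    "Revenue": [1.0, np.nan],
    "Property, Plant & Equipment, Net": [np.nan, 2.0],
    "Gross Profit": [np.nan, np.nan],
})

# With unique keys, each key maps to exactly one column on both sides.
x[unique_keys] = x[unique_keys].fillna(0)
print(x)
```

If the error persists after removing the repeated entry, checking x_.columns.is_unique would show whether the frame itself carries duplicate column names.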
