build_your_own_ai_investor_2021's Issues

Added a filter to remove rows with negative revenue

I added a filter to remove rows with negative revenue, which improved results.

X[(X['Revenue']<0)]  # rows with revenue issues

Issue where revenue is negative

bool_list5 = ~((X['Revenue']<0))

y=y[bool_list4 & bool_list5]
X=X[bool_list4 & bool_list5]
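
For anyone who wants to try the idea in isolation, here is a minimal sketch on toy data. The frames `X_demo` and `y_demo`, their values, and the `Perf` column are made up for illustration; only the `Revenue` column name and the mask expression come from the snippet above. Note that `~(Revenue < 0)` keeps rows where revenue is NaN, because a NaN comparison evaluates to False.

```python
import pandas as pd

# Toy data, invented for this example only.
X_demo = pd.DataFrame({"Revenue": [120.0, -5.0, 40.0],
                       "Net Income": [10.0, 1.0, -3.0]})
y_demo = pd.DataFrame({"Perf": [0.10, 0.50, -0.20]})

# True where revenue is not negative (rows with NaN revenue are also kept).
keep = ~(X_demo["Revenue"] < 0)

# Apply the same mask to both frames so X and y stay row-aligned.
X_clean = X_demo[keep]
y_clean = y_demo[keep]
```

In the notebook the mask would be combined with the existing ones (as in `bool_list4 & bool_list5`) rather than applied on its own.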

Error in Chapter 4 to 6/ 2 Process X_Y Learning Data fixNansInX function

For some reason, when I run this part of the code as given in the book, I get the following error. Any idea why it no longer works, or how to fix it?

def fixNansInX(x):
    '''
    Takes in x DataFrame, edits it so that important keys
    are 0 instead of NaN.
    '''
    keyCheckNullList = ["Short Term Debt" ,\
            "Long Term Debt" ,\
            "Interest Expense, Net",\
            "Income Tax (Expense) Benefit, Net",\
            "Cash, Cash Equivalents & Short Term Investments",\
            "Property, Plant & Equipment, Net",\
            "Revenue",\
            "Gross Profit",\
            "Total Current Liabilities",\
            "Property, Plant & Equipment, Net"]
    x[keyCheckNullList]=x[keyCheckNullList].fillna(0)

addColsToX(x_)
fixNansInX(x_)
X=getXRatios(x_)
fixXRatios(X)

y=getYPerf(y_)

______________________________________________________________________________________

ValueError                                Traceback (most recent call last)
Input In [134], in <cell line: 3>()
      1 # From x_ (raw fundamental data) get X (stock fundamental ratios)
      2 addColsToX(x_)
----> 3 fixNansInX(x_)
      4 X=getXRatios(x_)
      5 fixXRatios(X)

Input In [133], in fixNansInX(x)
      2 '''
      3 Takes in x DataFrame, edits it so that important keys
      4 are 0 instead of NaN.
      5 '''
      6 keyCheckNullList = ["Short Term Debt" ,\
      7         "Long Term Debt" ,\
      8         "Interest Expense, Net",\
   (...)
     14         "Total Current Liabilities",\
     15         "Property, Plant & Equipment, Net"]
---> 16 x[keyCheckNullList]=x[keyCheckNullList].fillna(0)

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3643, in DataFrame.__setitem__(self, key, value)
   3641     self._setitem_frame(key, value)
   3642 elif isinstance(key, (Series, np.ndarray, list, Index)):
-> 3643     self._setitem_array(key, value)
   3644 elif isinstance(value, DataFrame):
   3645     self._set_item_frame_value(key, value)

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3687, in DataFrame._setitem_array(self, key, value)
   3685     check_key_length(self.columns, key, value)
   3686     for k1, k2 in zip(key, value.columns):
-> 3687         self[k1] = value[k2]
   3689 elif not is_list_like(value):
   3690     for col in key:

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3645, in DataFrame.__setitem__(self, key, value)
   3643     self._setitem_array(key, value)
   3644 elif isinstance(value, DataFrame):
-> 3645     self._set_item_frame_value(key, value)
   3646 elif (
   3647     is_list_like(value)
   3648     and not self.columns.is_unique
   3649     and 1 < len(self.columns.get_indexer_for([key])) == len(value)
   3650 ):
   3651     # Column to set is duplicated
   3652     self._setitem_array([key], value)

File c:\users\danie\appdata\local\programs\python\python39\lib\site-packages\pandas\core\frame.py:3775, in DataFrame._set_item_frame_value(self, key, value)
   3773 len_cols = 1 if is_scalar(cols) else len(cols)
   3774 if len_cols != len(value.columns):
-> 3775     raise ValueError("Columns must be same length as key")
   3777 # align right-hand-side columns if self.columns
   3778 # is multi-index and self[key] is a sub-frame
   3779 if isinstance(self.columns, MultiIndex) and isinstance(
   3780     loc, (slice, Series, np.ndarray, Index)
   3781 ):

ValueError: Columns must be same length as key
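
A possible cause, judging only from the traceback: `keyCheckNullList` contains "Property, Plant & Equipment, Net" twice. During the assignment pandas loops over the keys, and selecting a repeated name from the right-hand side returns a two-column frame, which cannot be assigned to a single column. A minimal sketch of one workaround, assuming that is indeed the problem, is to drop the repeated names before calling `fillna`. The helper name `fix_nans_in_x` and the toy frame are hypothetical, not the book's code.

```python
import numpy as np
import pandas as pd

def fix_nans_in_x(x, keys):
    """Set NaN to 0 in the listed columns, ignoring repeated column names."""
    unique_keys = list(dict.fromkeys(keys))  # drop duplicates, keep order
    x[unique_keys] = x[unique_keys].fillna(0)
    return unique_keys

# Toy frame: the column names come from the list above, the values are made up.
x_demo = pd.DataFrame({
    "Revenue": [100.0, np.nan],
    "Property, Plant & Equipment, Net": [np.nan, 5.0],
})
keys_demo = ["Property, Plant & Equipment, Net",
             "Revenue",
             "Property, Plant & Equipment, Net"]  # repeated entry, as in the issue
used_keys = fix_nans_in_x(x_demo, keys_demo)
```

Simply deleting the second "Property, Plant & Equipment, Net" entry from the list should have the same effect.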

Error in Chapter 4 to 6/ 2 Process X_Y Learning Data in the last paragraph

Hi there,

I am really enjoying your book. Unfortunately, there is a mistake in the last paragraph, in the line X[Net Income] = X[Net Income_x] (line 22).

There is also one in the first part of the section (filtering raw data), in line 51: either X has to be x, or the x in def getYPricesReportDateAndTargetDate(x, d, modifier=365) has to be X.

In Chapter 2 - Exercise 3, the description of the Exercise does not match the solution

The description of Exercise 3 is:
# Output stocks with a price/earnings ratio below a number and a beta above a number.
# return a list of aceptable stocks with ascending by P/E

The function 'filterStocks' actually differs from the description in two ways:

  • It tests for beta below a number, not above;
  • It doesn't sort the acceptable stocks ascending by P/E
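
For reference, a version that follows the written description could look roughly like the sketch below. The function name `filter_stocks_as_described`, the column names `Ticker`, `P/E` and `Beta`, and the sample data are all assumptions made for this example; the book's `filterStocks` may use a different signature and data layout.

```python
import pandas as pd

def filter_stocks_as_described(df, max_pe, min_beta):
    """Keep stocks with P/E below max_pe and beta above min_beta,
    returned as a list of tickers sorted ascending by P/E."""
    mask = (df["P/E"] < max_pe) & (df["Beta"] > min_beta)
    return df[mask].sort_values("P/E")["Ticker"].tolist()

# Invented sample data.
stocks = pd.DataFrame({
    "Ticker": ["AAA", "BBB", "CCC", "DDD"],
    "P/E": [25.0, 8.0, 12.0, 10.0],
    "Beta": [1.5, 1.2, 0.7, 1.1],
})
picked = filter_stocks_as_described(stocks, max_pe=15, min_beta=1.0)
```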

Error in 4_Backtesting_AltmanZ, the 0 sample(s) problem

Hi, I am trying to follow the tutorial and I ran into an issue.

When I run the function below, an error occurs.
The sample data appears to be missing, but I checked and the 2019 data should exist.

code:
backTest = getPortTimeSeries(y_withData_Test, X_test,
                             daily_stock_prices_data,
                             trained_model_pipeline)

output:
Backtest performance for year starting 2016-12-31 00:00:00 is: 33.03 %
With stocks: ['UPS' 'LII' 'LNTH' 'BA' 'WK' 'INVA' 'NVAX']
UPS Performance was: 11.97 %
LII Performance was: 26.56 %
LNTH Performance was: 67.59 %
BA Performance was: 75.78 %
WK Performance was: 48.8 %
INVA Performance was: 21.69 %
NVAX Performance was: -21.19 %

Backtest performance for year starting 2017-12-31 00:00:00 is: 88.63 %
With stocks: ['LII' 'VSM' 'TLND' 'EGAN' 'TNDM' 'W' 'WNDW']
LII Performance was: 0.88 %
VSM Performance was: -16.76 %
TLND Performance was: -31.48 %
EGAN Performance was: 1.87 %
TNDM Performance was: 725.45 %
W Performance was: 5.47 %
WNDW Performance was: -65.0 %

Backtest performance for year starting 2018-12-31 00:00:00 is: -25.68 %
With stocks: ['ENDP' 'BA' 'DBD' 'CHRS' 'PHUN' 'MNKD' 'WK']
ENDP Performance was: -61.57 %
BA Performance was: -19.2 %
DBD Performance was: 18.67 %
CHRS Performance was: 22.14 %
PHUN Performance was: -97.69 %
MNKD Performance was: -29.95 %
WK Performance was: -12.19 %


ValueError Traceback (most recent call last)
in
----> 1 backTest = getPortTimeSeries(y_withData_Test, X_test,
2 daily_stock_prices_data,
3 trained_model_pipeline)
4
5 print('Performance is: ',

in getPortTimeSeries(y_withData, X, daily_stock_prices, ml_model_pipeline, verbose)
19
20 [comp, this_year_perf, ticker_list] =
---> 21 getPortTimeSeriesForYear(curr_date, y_withData, X,
22 daily_stock_prices, ml_model_pipeline)
23

in getPortTimeSeriesForYear(date_starting, y_withData, X, daily_stock_prices, ml_model_pipeline)
26
27 # Get return prediction from model
---> 28 y_pred = ml_model_pipeline.predict(X[thisYearMask])
29
30 # Make it a DataFrame to select the top picks

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\metaestimators.py in (*args, **kwargs)
118
119 # lambda, but not partial, allows help() to work with update_wrapper
--> 120 out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
121 # update the docstring of the returned function
122 update_wrapper(out, self.fn)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\pipeline.py in predict(self, X, **predict_params)
416 Xt = X
417 for _, name, transform in self._iter(with_final=False):
--> 418 Xt = transform.transform(Xt)
419 return self.steps[-1][-1].predict(Xt, **predict_params)
420

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_data.py in transform(self, X)
3094 """
3095 check_is_fitted(self)
-> 3096 X = self._check_input(X, in_fit=False, check_positive=True,
3097 check_shape=True)
3098

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_data.py in _check_input(self, X, in_fit, check_positive, check_shape, check_method)
3268 If True, check that the transformation method is valid.
3269 """
-> 3270 X = self._validate_data(X, ensure_2d=True, dtype=FLOAT_DTYPES,
3271 copy=self.copy, force_all_finite='allow-nan',
3272 reset=in_fit)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
419 out = X
420 elif isinstance(y, str) and y == 'no_validation':
--> 421 X = check_array(X, **check_params)
422 out = X
423 else:

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
667 n_samples = _num_samples(array)
668 if n_samples < ensure_min_samples:
--> 669 raise ValueError("Found array with %d sample(s) (shape=%s) while a"
670 " minimum of %d is required%s."
671 % (n_samples, array.shape, ensure_min_samples,

ValueError: Found array with 0 sample(s) (shape=(0, 18)) while a minimum of 1 is required.
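
The shape (0, 18) in the message means the frame passed to `predict` had 18 columns but no rows. The backtest printed results for the years starting 2016-12-31 through 2018-12-31, so the failure seems to occur on the next year, where `X[thisYearMask]` apparently selects nothing. A hedged sketch of a guard that skips such a year instead of calling the model is shown below. The helper `rows_for_year`, the `EV/EBIT` column and the toy data are hypothetical, and how `thisYearMask` is built in the book's code is not shown in this report.

```python
import pandas as pd

def rows_for_year(X, this_year_mask, date_starting):
    """Return the rows selected by the mask, or None if it selects nothing."""
    selected = X[this_year_mask]
    if selected.empty:
        print(f"No rows for year starting {date_starting}; skipping prediction.")
        return None
    return selected

# Toy data: two reports, both dated 2018-12-31.
X_demo = pd.DataFrame({"EV/EBIT": [10.0, 12.0]})
dates = pd.Series(pd.to_datetime(["2018-12-31", "2018-12-31"]))

has_rows = rows_for_year(X_demo, dates == pd.Timestamp("2018-12-31"), "2018-12-31")
no_rows = rows_for_year(X_demo, dates == pd.Timestamp("2019-12-31"), "2019-12-31")
```

Printing `thisYearMask.sum()` for each start date is a quick way to check whether the 2019 rows are actually present in the test set after the train/test split.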
