lamda-nju / deep-forest

An Efficient, Scalable and Optimized Python Framework for Deep Forest (2021.2.1)

Home Page: https://deep-forest.readthedocs.io

License: Other

Shell 0.04% Python 53.34% Cython 46.62%

deep-forest ensemble-learning machine-learning python random-forest

deep-forest's Issues

ValueError: Number of features of the model must match the input. Model n_features is 9 and input n_features is 13

The latest version of deep forest seems to have a bug in the "sklearn" backend. I do not know the exact cause, but the package appears to transform the data inconsistently, so the number of features differs between the fitting and prediction steps.

ValueError: Number of features of the model must match the input. Model n_features is 9 and input n_features is 13

Support `CascadeForestRegressor` for univariate and multivariate regression

The forest model excels at the regression problem, and tends to have a much smaller variance than GBDTs. Therefore, it would be nice if deep forest could further support univariate and multivariate regression.

Related issue
#3

Possible steps on this feature request

Add scikit-learn's RandomForestRegressor and ExtraTreesRegressor to estimator.py
Implement a cascade layer for regression in layer.py
Implement the CascadeForestRegressor in cascade.py
Evaluate the performance of CascadeForestRegressor
Add unit tests on all related parts

Feature Requests

This issue collects all feature requests. Anyone is welcome to work on the issues listed below. Do not forget to add your contribution and name to CHANGELOG.rst.

If you want to work on a requested feature, please re-open the linked issue, and leave a comment below to let us know that you want to work on it.

New features

CascadeForestRegressor class for regression problem (#4)
export_graphviz method on visualizing decision trees in deep forest (#12)
Export decision tree in DF21 to SHAP (#12)
Internal label encoder on the class labels (#13)
Better support on the input data (#19, #20)
GPU Support (#27)
Support customized estimator in the layers (#29)
CascadeForestSurvAnalyzer class for survival analysis (#71)
Customized evaluation metrics in cascade forest (#98)

Python package

Build wheels for Python 2.7
Set up CI on building wheels for Mac-OS (#6, #32)

New language wrappers:

C/C++ Interface (#9)

Fix

Model fix on extremely small datasets (#103)

Has sub-partition been implemented now?

Hello, I am a new user of deep-forest.
I have read the intro slides. Could you tell me whether the sub-partition method of distributed representation learning has been implemented yet?
Thank you so much.
@xuyxu

Please help me!

mod=model.get_layer_feature_importances(layer_idx=0)
print(mod)

RuntimeError: Please use the sklearn backend to get the feature importances property for each cascade layer.

I want to get the layer feature importances, but I do not know how.

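How to get the feature importance ranking

Can DF provide a feature importance ranking the way RF does?

How does DF perform feature selection internally?

What is the process DF uses for feature selection?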

Error: could not allocate 0 bytes

When I was using this package, I experienced the following problem. According to my observation, there is still a lot of available memory. Thus, what's the problem?

  File "deepforest/tree/_tree.pyx", line 123, in deepforest.tree._tree.DepthFirstTreeBuilder.build
  File "deepforest/tree/_tree.pyx", line 256, in deepforest.tree._tree.DepthFirstTreeBuilder.build
  File "deepforest/tree/_tree.pyx", line 480, in deepforest.tree._tree.Tree._resize_node_c
  File "deepforest/tree/_utils.pyx", line 34, in deepforest.tree._utils.safe_realloc
MemoryError: could not allocate 0 bytes

Is there a C/C++ interface? How to deploy to a real-world application scenario?

Could you provide a parameter grid for deep-forest?

I am glad to see that this method has achieved promising results on many machine learning tasks. In real-world scenarios, however, we usually tune the hyper-parameters of a classifier with cross-validation. I am currently building a machine learning benchmark, and I believe that a proper parameter grid is vital for comparing algorithms fairly. Could you provide a recommended parameter grid for deep-forest, or at least a guideline for tuning its hyper-parameters?

Is there any way to interpret the DF21 model, like graphviz for decision trees or SHAP for XGB?

completely-random tree forests

I have a question: if I want to build a completely-random tree forest, should I use ExtraTreesClassifier, and how should I set max_features?

Thanks for your help

can't install package use conda env

ERROR: Could not find a version that satisfies the requirement deep-forest (from versions: none)
ERROR: No matching distribution found for deep-forest

system: mac
python version: 3.8.5
pip version: 20.2.4

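Problem installing DF21

I installed it in Anaconda and confirmed that the versions of the required libraries are correct. Running the example produces "numpy.ufunc size changed, may indicate binary incompatibility. Expected 216 from C header, got 192 from PyObject". Why does this happen?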

Multi-grained scanning part still exist?

Hi. I want to set the size of the sliding window, but I did not find the code of multi-grained scanning part in DF21. Does the function of multi-grained scanning part exist in DF21? If so, where can I find the corresponding file? Also, I wonder, do subcascades of each cascade still exist, e.g., Level 1A, Level 1B, Level 1C. Look forward to your help. Thank you!

gpu support and sklearn support

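Is GPU supported now? Also, is sklearn's GridSearchCV supported?

How to define CascadeForestClassifier dynamically

I need to run a grid search, so I need to construct CascadeForestClassifier dynamically by passing it a set of parameters in dictionary form, but CascadeForestClassifier does not seem to have a function for this.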
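One common way to build a model from a parameter dictionary is keyword unpacking (`Model(**params)`). The sketch below expands a grid into dictionaries and calls a factory on each one. The helper names `expand_grid` and `build_models` are hypothetical, and `dict` is used as a stand-in factory so that the example runs without deepforest installed.

```python
from itertools import product


def expand_grid(grid):
    """Yield one parameter dictionary per combination in `grid`."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def build_models(factory, grid):
    """Call `factory(**params)` for every parameter combination."""
    return [factory(**params) for params in expand_grid(grid)]


grid = {"n_estimators": [2, 4], "max_layers": [10, 20]}

# `dict` stands in for the model class so the sketch runs on its own;
# in practice one would pass CascadeForestClassifier as `factory`.
models = build_models(dict, grid)
```

Each element of `models` here is just the dictionary that would have been unpacked into the constructor, which makes it easy to check which combinations are generated.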

is there deep forest regressor?

CascadeForestClassifier for cross validation in sklearn

The CascadeForestClassifier class cannot be used directly in sklearn's cross_val_score. Maybe it could inherit from BaseEstimator?
Something like this:

from deepforest import CascadeForestClassifier
from sklearn.base import BaseEstimator
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)


class CFC(CascadeForestClassifier, BaseEstimator):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def score(self, X, y):
        return accuracy_score(y, self.predict(X))


score = cross_val_score(CFC(random_state=10), X, y)
print(score)

About the weakness of deep forest

Is it possible to use FPGA acceleration?

Can CascadeForestClassifier.predict() return the original class labels instead of 0/1?

Trying the basic tutorial:

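from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from deepforest import CascadeForestClassifier

X, y = load_digits(return_X_y=True)
y += 100 # This is what I changed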
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = CascadeForestClassifier(random_state=1)

# Train
model.fit(X_train, y_train)

# Evaluate
y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred) * 100

print("Testing Accuracy: {:.3f} %".format(acc))

And get:

Testing Accuracy: 0.000 %

This is because the class labels are made into "101", "102", "103", ..., instead of "1", "2", "3",... .
But the predict() function (or the model itself) could not deal with these labels.

Is it possible to let the CascadeForestClassifier.predict() function use the original class label (e.g., "101" instead of "1")? It is the basic feature of sklearn models. Or it is also fine to explain in the documentation how CascadeForestClassifier maps original class labels into integers. Now it is a bit confusing and not very convenient to use.

BTW, big fan of your work! :))) I've been waiting for it ever since it was published at IJCAI.
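The label round trip requested in this issue can be sketched as follows. This is a minimal illustration, not DF21's implementation; the `LabelCodec` class and its method names are hypothetical.

```python
class LabelCodec:
    """Hypothetical helper: maps arbitrary class labels to 0..K-1 and back."""

    def fit(self, y):
        # The sorted unique labels define the integer codes.
        self.classes_ = sorted(set(y))
        self._to_code = {label: code for code, label in enumerate(self.classes_)}
        return self

    def encode(self, y):
        return [self._to_code[label] for label in y]

    def decode(self, codes):
        return [self.classes_[code] for code in codes]


y_train = [101, 103, 102, 101, 103]
codec = LabelCodec().fit(y_train)
encoded = codec.encode(y_train)   # integer codes a model would be trained on
decoded = codec.decode(encoded)   # original labels a predict() call could return
```

Encoding before `fit` and decoding after `predict` keeps the integer codes an internal detail, so the caller only ever sees the labels that were passed in.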

Multi-Grained Scanning implementation

Does this implementation support multi-grained scanning?
Note: multi-grained scanning (M.G.S.) is the first part of gcForest.

take() got an unexpected keyword argument 'axis'

Got error with code:
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

from deepforest import CascadeForestClassifier

model = CascadeForestClassifier(random_state=1)
model.fit(X_train, y_train)

TypeError Traceback (most recent call last)
in
6
7 model = CascadeForestClassifier(random_state=1)
----> 8 model.fit(X_train, y_train.values.ravel())

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/deepforest/cascade.py in fit(self, X, y, sample_weight)
1395 y = self._encode_class_labels(y)
1396
-> 1397 super().fit(X, y, sample_weight)
1398
1399 def predict_proba(self, X):

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/deepforest/cascade.py in fit(self, X, y, sample_weight)
754
755 # Bin the training data
--> 756 X_train_ = self.bin_data(binner, X, is_training_data=True)
757 X_train_ = self.buffer_.cache_data(0, X_train_, is_training_data=True)
758

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/deepforest/cascade.py in _bin_data(self, binner, X, is_training_data)
665 tic = time.time()
666 if is_training_data:
--> 667 X_binned = binner.fit_transform(X)
668 else:
669 X_binned = binner.transform(X)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
697 if y is None:
698 # fit method of arity 1 (unsupervised transformation)
--> 699 return self.fit(X, **fit_params).transform(X)
700 else:
701 # fit method of arity 2 (supervised transformation)

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/deepforest/_binner.py in fit(self, X)
128 self.validate_params()
129
--> 130 self.bin_thresholds = _find_binning_thresholds(
131 X,
132 self.n_bins - 1,

/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/deepforest/_binner.py in _find_binning_thresholds(X, n_bins, bin_subsample, bin_type, random_state)
75 if n_samples > bin_subsample:
76 subset = rng.choice(np.arange(n_samples), bin_subsample, replace=False)
---> 77 X = X.take(subset, axis=0)
78
79 binning_thresholds = []

TypeError: take() got an unexpected keyword argument 'axis'

The dataset is loaded with vaex. Is this problem specific to vaex?

Call for User Report

As stated in the documentation, the goal of this package is to:

Provide users with an effective and powerful option to traditional tree-based ensemble models such as random forest and gradient boosting decision tree.

In order to promote the use of deep forest, and to help this package become another popular option when you are considering tree-based ensemble models, we would like to call for user reports on using deep forest.

We are particularly interested in:

Promising results of deep forest on machine learning competitions from Kaggle, Tianchi, etc.
Novel applications of deep forest in research fields such as bioinformatics, computational finance, etc.

In a future release, we will set up another webpage in our documentation, and your contributions will be posted there. There is no strict limitation on the form of your contribution: it could be your winning solution to a competition, a link to your published articles, and so on.

Please feel free to comment below or send me an e-mail if you are willing to share your achievements with us. Thanks!

Multi-Grained Scanning

Hi! I am learning deep forest and have a question about sliding windows. Say we have 400-dimensional raw input features and generate 301 instances of size 100. We train a forest on the first instance; is the second instance then fed into that same forest, or used to train another forest? In other words, does the multi-grained scanning part train 1 forest or 301 forests?
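For reference, the window count in this question can be reproduced with a short sketch. It only illustrates the sliding-window arithmetic (400 - 100 + 1 = 301 with a stride of 1) and makes no claim about how the windows are assigned to forests; the function name `sliding_windows` is hypothetical.

```python
def sliding_windows(features, window_size, stride=1):
    """Return every contiguous window of `window_size` values."""
    last_start = len(features) - window_size
    return [
        features[start:start + window_size]
        for start in range(0, last_start + 1, stride)
    ]


raw = list(range(400))                       # one 400-dimensional sample
windows = sliding_windows(raw, window_size=100)
n_windows = len(windows)                     # 400 - 100 + 1
```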

Documentation is inaccessible

Pages such as https://deep-forest.readthedocs.io/en/latest/how_to_get_started.html are not available.

[BUG] `CascadeForestRegressor` somehow cannot be inserted into a DataFrame

Describe the bug
CascadeForestRegressor somehow cannot be inserted into a DataFrame

To Reproduce

import pandas as pd
from deepforest import CascadeForestRegressor
from ngboost import NGBRegressor

ngr = NGBRegressor()  # ngboost regressor for example. xgb, lgb should also be no problem.
cfr = CascadeForestRegressor()
df= pd.DataFrame()

# somehow OK
df.insert(0, "ngr", [ngr])
# somehow error
df.insert(0, "cf", [cfr])

Expected behavior
No error

Additional context

ValueError                                Traceback (most recent call last)
<ipython-input-32-ab0139d10254> in <module>
----> 1 df.insert(0, "cf", [cforest])

/mnt/hdd2/lvhao/miniconda3/envs/pycaret/lib/python3.7/site-packages/pandas/core/frame.py in insert(self, loc, column, value, allow_duplicates)
   3760             )
   3761         self._ensure_valid_index(value)
-> 3762         value = self._sanitize_column(column, value, broadcast=False)
   3763         self._mgr.insert(loc, column, value, allow_duplicates=allow_duplicates)
   3764 

/mnt/hdd2/lvhao/miniconda3/envs/pycaret/lib/python3.7/site-packages/pandas/core/frame.py in _sanitize_column(self, key, value, broadcast)
   3900             if not isinstance(value, (np.ndarray, Index)):
   3901                 if isinstance(value, list) and len(value) > 0:
-> 3902                     value = maybe_convert_platform(value)
   3903                 else:
   3904                     value = com.asarray_tuplesafe(value)

/mnt/hdd2/lvhao/miniconda3/envs/pycaret/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in maybe_convert_platform(values)
    110     """ try to do platform conversion, allow ndarray or list here """
    111     if isinstance(values, (list, tuple, range)):
--> 112         values = construct_1d_object_array_from_listlike(values)
    113     if getattr(values, "dtype", None) == np.object_:
    114         if hasattr(values, "_values"):

/mnt/hdd2/lvhao/miniconda3/envs/pycaret/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in construct_1d_object_array_from_listlike(values)
   1636     # making a 1D array that contains list-likes is a bit tricky:
   1637     result = np.empty(len(values), dtype="object")
-> 1638     result[:] = values
   1639     return result
   1640 

/mnt/hdd2/lvhao/miniconda3/envs/pycaret/lib/python3.7/site-packages/deepforest/cascade.py in __getitem__(self, index)
    518 
    519     def __getitem__(self, index):
--> 520         return self._get_layer(index)
    521 
    522     def _get_n_output(self, y):

/mnt/hdd2/lvhao/miniconda3/envs/pycaret/lib/python3.7/site-packages/deepforest/cascade.py in _get_layer(self, layer_idx)
    561             logger.debug("self.n_layers_ = "+ str(self.n_layers_))
    562             logger.debug("layer_idx = "+ str(layer_idx))
--> 563             raise ValueError(msg.format(self.n_layers_ - 1, layer_idx))
    564 
    565         layer_key = "layer_{}".format(layer_idx)

ValueError: The layer index should be in the range [0, 1], but got 2 instead.

This bug can be fixed simply by changing if not 0 <= layer_idx < self.n_layers_: to if not 0 <= layer_idx <= self.n_layers_:, but I still do not know the cause of this error or whether this fix is correct.
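One possible explanation, offered as an assumption rather than a confirmed diagnosis: when an object defines `__getitem__` but not `__iter__`, Python iterates it by calling `__getitem__` with 0, 1, 2, ... and only an `IndexError` ends that loop. If pandas or NumPy iterates the model this way, a `ValueError` raised for the first out-of-range index would surface as in the traceback above. The toy class below (hypothetical, not DF21 code) shows the difference between the two exception types.

```python
class Layers:
    """Toy container with two items, indexable like the cascade model."""

    def __init__(self, error_type):
        self._items = ["layer_0", "layer_1"]
        self._error_type = error_type

    def __getitem__(self, index):
        if not 0 <= index < len(self._items):
            raise self._error_type(f"index {index} out of range")
        return self._items[index]


# IndexError ends the legacy iteration protocol cleanly.
ok = list(Layers(IndexError))

# ValueError is not treated as "end of sequence", so it propagates.
try:
    list(Layers(ValueError))
    propagated = False
except ValueError:
    propagated = True
```

If this is indeed the cause, raising `IndexError` for out-of-range layer indices may be a less invasive fix than loosening the bound check, since loosening it would accept an index for which no layer exists.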

Is it possible to support the Arm processor?

Currently, there are a lot of devices that are running on Arm processors. Thus, supporting the Arm processor is a good idea to allow the deep forest to have much wider application prospects.

.fit() not working when dataset contains bool variables

I have a dataset with ~500 variables, some of which are boolean. I got this error when trying to fit the model:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

And when I fit the model without these boolean variables, it worked.
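A possible workaround is sketched below, under the assumption (not confirmed in this issue) that mixing boolean and numeric columns produced an object-dtype feature matrix. `np.isnan` rejects object arrays with this kind of `TypeError`, while an explicit cast to a float dtype converts `True`/`False` to `1.0`/`0.0`.

```python
import numpy as np

# Mixed float and bool columns can end up as a single object-dtype array.
X = np.array([[1.5, True], [2.0, False]], dtype=object)

try:
    np.isnan(X)
    isnan_failed = False
except TypeError:
    isnan_failed = True

# Casting to a numeric dtype turns True/False into 1.0/0.0.
X_numeric = X.astype(np.float64)
has_nan = bool(np.isnan(X_numeric).any())
```

Passing `X_numeric` instead of `X` to `fit` would avoid the object-dtype check, provided every column can be converted to a float.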

Buffer dtype mismatch

An error occurs when training on my dataset:
File "deepforest/_cutils.pyx", line 59, in deepforest._cutils._map_to_bins
File "deepforest/_cutils.pyx", line 76, in deepforest._cutils._map_to_bins
ValueError: Buffer dtype mismatch, expected 'const X_DTYPE_C' but got 'long'

Survival models

Hi maintainer,

I am wondering whether it is possible to cascade random survival forests (maybe a sksurv model) instead of RF in your deep forest model. As in #48, it seems that the supported model types are classification and regression. (Or did I miss some part of the tutorial docs?)

Thanks.

Set up the pre-commit workflow for automatic formatting on code style

Maybe we can add a formatter like black to the Makefile and GitHub Actions to make life easier for developers.
What do you think?

Can DF21 extract features like a CNN when used in object detection?

Or do we need to extract features with HOG/SIFT, etc.?

A problem :Size of weights must equal to number of rows

When I fit the models, an error is raised:
Check failed: weights_.Size() == num_row_ (385683 vs. 308546) : Size of weights must equal to number of rows.
I have checked the source code of kfoldwrapper, and I find that:

Maybe "sample_weight" should be "sample_weight[train_idx]"? Otherwise the shape of sample_weight cannot match that of X.
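The suggested fix can be illustrated with a small k-fold sketch. The two sizes in the error message are consistent with the full weight vector (385683) being passed alongside a training fold of roughly four fifths of the rows (308546), as a 5-fold split would produce. The helper `kfold_indices` is hypothetical and is not the kfoldwrapper code.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, val_idx) lists for contiguous, unshuffled folds."""
    fold_sizes = [n_samples // n_splits] * n_splits
    for i in range(n_samples % n_splits):
        fold_sizes[i] += 1
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


X = [[float(i)] for i in range(10)]
sample_weight = [1.0 + 0.1 * i for i in range(10)]

checked = []
for train_idx, _ in kfold_indices(len(X), n_splits=5):
    X_train = [X[i] for i in train_idx]
    # Subset the weights with the same indices instead of passing the full vector.
    w_train = [sample_weight[i] for i in train_idx]
    checked.append((len(X_train), len(w_train), len(sample_weight)))
```

In every fold the subset weights have the same length as the training rows (8), whereas the full vector (10) would trigger the size check quoted above.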

Custom CascadeForestClassifier

Hey,

Thanks for your awesome repo.

I have a question: could you please give me an example of how to replace RandomForestClassifier and ExtraTreesClassifier in the CascadeForestClassifier?

Are there any plans to support installation on MacOS?

I found that the MacOS version is not yet available in the pypi repository. I wonder if there are plans to provide a mac version, or is there any way to install it manually?

[BUG] cannot correctly clone `CascadeForestRegressor` with `sklearn.base.clone` when using customized estimators

Describe the bug
A CascadeForestClassifier/CascadeForestRegressor object cannot be cloned correctly with sklearn.base.clone when it uses customized estimators.

To Reproduce

from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.base import clone
from deepforest import CascadeForestRegressor
import xgboost as xgb
import lightgbm as lgb

X, y = load_boston(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
model = CascadeForestRegressor(random_state=1)

# set estimator
n_estimators = 4  # the number of base estimators per cascade layer
estimators = [lgb.LGBMRegressor(random_state=i)  for i in range(n_estimators)]
model.set_estimator(estimators)

# set predictor 
predictor = xgb.XGBRegressor()
model.set_predictor(predictor)

# clone model
model_new = clone(model)

# try to fit the cloned model
model_new.fit(X_train, y_train)

Expected behavior
No error

Additional context

~/miniconda3/envs/pycaret/lib/python3.8/site-packages/deep_forest-0.1.5-py3.8-linux-x86_64.egg/deepforest/cascade.py in fit(self, X, y, sample_weight)
   1004                 if not hasattr(self, "predictor_"):
   1005                     msg = "Missing predictor after calling `set_predictor`"
-> 1006                     raise RuntimeError(msg)
   1007 
   1008             binner_ = Binner(

RuntimeError: Missing predictor after calling `set_predictor`

This bug occurs because, when a model with a customized predictor or estimators is cloned, predictor='custom' is copied while self.predictor_ / self.dummy_estimators are not, which leads to the error described above.

I think this bug can be fixed easily by making the predictor and the list of estimators constructor parameters of CascadeForestClassifier/CascadeForestRegressor, as other meta-estimators do (e.g. ngboost), although the corresponding APIs may have to change.

For example, the API parameters could be:

model = CascadeForestRegressor(
    estimators=[lgb.LGBMRegressor(random_state=i) for i in range(n_estimators)],
    predictor=xgb.XGBRegressor(),
)
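The mechanism described in this report can be reproduced with a toy model. Both `ToyCascade` and `toy_clone` are hypothetical stand-ins: `toy_clone` only imitates the general idea of rebuilding an estimator from its constructor parameters and is not the scikit-learn implementation.

```python
class ToyCascade:
    """Hypothetical stand-in for the cascade model, not the DF21 class."""

    def __init__(self, predictor="forest"):
        self.predictor = predictor

    def get_params(self):
        # Only constructor arguments are reported.
        return {"predictor": self.predictor}

    def set_predictor(self, predictor):
        # The object stored here is not a constructor argument.
        self.predictor = "custom"
        self.predictor_ = predictor


def toy_clone(estimator):
    """Simplified imitation of cloning: rebuild from constructor parameters."""
    return type(estimator)(**estimator.get_params())


model = ToyCascade()
model.set_predictor(object())
model_new = toy_clone(model)

flag_copied = model_new.predictor == "custom"
object_copied = hasattr(model_new, "predictor_")
```

The clone carries the `"custom"` flag but not the predictor object itself, which is the inconsistent state that the `RuntimeError` above complains about. Passing the predictor through the constructor, as proposed, would let it travel with the parameters.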

【help】params tuning

For my experiment, I set the parameters like this:

parameters = [
    {
        'n_estimators': [2, 5, 8, 10],
        'n_trees': [50, 100, 150, 200, 250, 300],
        'predictors': ['xgboost', 'lightgbm', 'forest'],
        'max_layers': [20, 50, 80, 120, 150],
        'use_predictor': [True]
    },
    {
        'n_estimators': [2, 5, 8, 10, 13],
        'n_trees': [50, 100, 150, 200, 250, 300, 400],
        'max_layers': [20, 50, 80, 120, 150, 200],
    },
]

In the end, the experiment gives the same result for different predictors when use_predictor is True, and different max_layers values can also give the same result.

I would like to know whether this behavior is correct.

Setting the predictor parameter of CascadeForestRegressor to lightgbm has no effect

If I set use_predictor=True and predictor='lightgbm' via model.set_params(dict), the predictor in the model is actually still forest.

Can this library support for regression problems?

Better Documentation

Thanks to the contributors, many new features have been developed. As a result, parts of the current documentation may be ambiguous and need more explanation or demonstration.

This issue collects suggestions on the documentation. Anyone is welcome to improve the readability of the documentation. Contributors unfamiliar with our workflow for building the documentation can refer to the instructions below.

Build the documentation locally

Clone the repo and install package dependencies:

git clone https://github.com/LAMDA-NJU/Deep-Forest.git
cd Deep-Forest/docs
pip install -r requirements.txt

Make modifications to the corresponding .rst file. (Wiki of rst)
Build the documentation:

make html

The generated html files are available in the directory _build/html/, and the homepage is index.html.

Build the documentation via pull request

Readthedocs has been integrated into our CI, and you can also view the documentation after creating your PR, available in the last row of GitHub Checks on your PR page.

All contributors

Full list available at Contributors.

@dwaipayan05: #31

Please help! I need to implement a multi-input, multi-output model, but I do not know how to do it.

Hello!
I would like to know whether DF21's CascadeForestRegressor or CascadeForestClassifier supports multiple outputs (e.g., multiple inputs => model => multi-dimensional output).

Hello!
I am very glad to have found your great and useful work, but I want to know how to implement a multi-input, multi-output model in DF21. I really need it in my environment. Can you help me?
Please

Generic cascade structure

Maybe at some point we can refactor and allow users to supply custom models for the layers.

How to train it with a large training set?

Hi, I would like to know whether deep forest can support mini-batches like DL.

Does not understand character buffer dtype format string

I keep getting this error:

Traceback (most recent call last):
File "classification.py", line 18, in <module>
model.fit(X_train, y_train)
File "/afs/crc.nd.edu/user/a/alaguna/Documents/OngoingResearch/Forest/Deep-Forest/deepforest/cascade.py", line 1418, in fit
super().fit(X, y, sample_weight)
File "/afs/crc.nd.edu/user/a/alaguna/Documents/OngoingResearch/Forest/Deep-Forest/deepforest/cascade.py", line 811, in fit
X_train_, y, sample_weight=sample_weight
File "/afs/crc.nd.edu/user/a/alaguna/Documents/OngoingResearch/Forest/Deep-Forest/deepforest/_layer.py", line 222, in fit_transform
sample_weight,
File "/afs/crc.nd.edu/user/a/alaguna/Documents/OngoingResearch/Forest/Deep-Forest/deepforest/_layer.py", line 40, in _build_estimator
X_aug_train = estimator.fit_transform(X, y, sample_weight)
File "/afs/crc.nd.edu/user/a/alaguna/Documents/OngoingResearch/Forest/Deep-Forest/deepforest/estimator.py", line 212, in fit_transform
self.estimator.fit(X, y, sample_weight)
File "/afs/crc.nd.edu/user/a/alaguna/Documents/OngoingResearch/Forest/Deep-Forest/deepforest/forest.py", line 479, in fit
for i, t in enumerate(trees)
File "/afs/crc.nd.edu/user/a/alaguna/.local/lib/python3.7/site-packages/joblib/parallel.py", line 921, in __call__
if self.dispatch_one_batch(iterator):
File "/afs/crc.nd.edu/user/a/alaguna/.local/lib/python3.7/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
self._dispatch(tasks)
File "/afs/crc.nd.edu/user/a/alaguna/.local/lib/python3.7/site-packages/joblib/parallel.py", line 716, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/afs/crc.nd.edu/user/a/alaguna/.local/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 182, in apply_async
result = ImmediateResult(func)
File "/afs/crc.nd.edu/user/a/alaguna/.local/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 549, in __init__
self.results = batch()
File "/afs/crc.nd.edu/user/a/alaguna/.local/lib/python3.7/site-packages/joblib/parallel.py", line 225, in __call__
for func, args, kwargs in self.items]
File "/afs/crc.nd.edu/user/a/alaguna/.local/lib/python3.7/site-packages/joblib/parallel.py", line 225, in <listcomp>
for func, args, kwargs in self.items]
File "/afs/crc.nd.edu/user/a/alaguna/Documents/OngoingResearch/Forest/Deep-Forest/deepforest/forest.py", line 119, in _parallel_build_trees
tree.random_state, n_samples, n_samples_bootstrap
File "/afs/crc.nd.edu/user/a/alaguna/Documents/OngoingResearch/Forest/Deep-Forest/deepforest/forest.py", line 98, in _generate_sample_mask
sample_mask = _LIB._c_sample_mask(sample_indices, n_samples)
File "deepforest/_cutils.pyx", line 38, in deepforest._cutils._c_sample_mask
cpdef np.ndarray _c_sample_mask(const INT32_t [:] indices,
File "deepforest/_cutils.pyx", line 46, in deepforest._cutils._c_sample_mask
np.ndarray[BOOL, ndim=1] sample_mask = np.zeros((n_samples,),
ValueError: Does not understand character buffer dtype format string ('?')

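Error

TypeError: unhashable type: 'slice'
The input data is purely numeric and the number of samples matches the number of labels, so I do not know why this error is raised. I look forward to your reply, and thank you in advance.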

GPU Support

Considering the number of aggregations from each model in each layer, it would be nice to train the models faster

An idea for the future: transfer learning based on deep forest?

Is it possible that transfer learning based on deep forest will appear in this library in the future? I find that the transfer ability of a model is very important in practical engineering tasks and in many academic papers. If deep-forest could also be applied to transfer learning, it would be very competitive with neural networks. Thank you very much!

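What is the validation split ratio in DF?

We know that DF automatically sets aside part of the training set as a validation set. What is the ratio of this split?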