Comments (12)
Just one thing to note. It doesn't really impact me, as I'm using a small dataset, but when I use the KerasClassifier in the ensemble I can only get it to work by setting n_jobs=1. If I set it to -1 it won't work.
Hi!
Looks like you can't make deep copies of Keras models. I took a look at the code, and the use of deepcopy there is excessively protective; it's a legacy artifact of a previous version. I've pushed a branch into PR #98 that solves your issue. Would be great if you could give it a try and confirm that it does!
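The failure mode reported here can be reproduced without mlens or Keras at all: `deepcopy` falls back to pickle via `__reduce_ex__`, and any object whose state holds a `_thread.RLock` (as compiled Keras/TensorFlow models do) raises the same TypeError. A minimal sketch, using a stand-in class rather than a real model:

```python
import copy
import threading

# A stand-in for a compiled Keras model: any object whose state holds
# a lock cannot be deep-copied, because deepcopy falls back to pickle.
class ModelLike:
    def __init__(self):
        self.lock = threading.RLock()

try:
    copy.deepcopy(ModelLike())
except TypeError as exc:
    # e.g. "cannot pickle '_thread.RLock' object" (wording varies by Python version)
    print(type(exc).__name__)
```

This is why replacing the deep copy with a shallower copy strategy sidesteps the error.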
Hi, and thanks for the prompt feedback. Fantastic package, and I'm looking forward to seeing it develop!
OK, so this did seem to work; however, if I use KerasClassifier as the meta estimator I get the same error during the fit stage:
```
TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
     16 ensemble.add_meta(KerasClassifier(build_fn=create_model2, epochs=20, batch_size=5), proba=True)
     17
---> 18 ensemble.fit(X[:294], y[:294],)
     19
     20

D:\Continuum\anaconda3\lib\site-packages\mlens\ensemble\base.py in fit(self, X, y, **kwargs)
    514         self._id_train.fit(X)
    515
--> 516         out = self._backend.fit(X, y, **kwargs)
    517         if out is not self._backend:
    518             # fit_transform

D:\Continuum\anaconda3\lib\site-packages\mlens\ensemble\base.py in fit(self, X, y, **kwargs)
    156         with ParallelProcessing(self.backend, self.n_jobs,
    157                                 max(self.verbose - 4, 0)) as manager:
--> 158             out = manager.stack(self, 'fit', X, y, **kwargs)
    159
    160         if self.verbose:

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\backend.py in stack(self, caller, job, X, y, path, return_preds, warm_start, split, **kwargs)
    671             job=job, X=X, y=y, path=path, warm_start=warm_start,
    672             return_preds=return_preds, split=split, stack=True)
--> 673         return self.process(caller=caller, out=out, **kwargs)
    674
    675     def process(self, caller, out, **kwargs):

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\backend.py in process(self, caller, out, **kwargs)
    716                 self.job.clear()
    717
--> 718             self._partial_process(task, parallel, **kwargs)
    719
    720             if task.name in return_names:

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\backend.py in _partial_process(self, task, parallel, **kwargs)
    737             self._gen_prediction_array(task, self.job.job, self.threading)
    738
--> 739         task(self.job.args(**kwargs), parallel=parallel)
    740
    741         if not task.no_output and getattr(task, 'n_feature_prop', 0):

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\layer.py in __call__(self, args, parallel)
    157
    158         if job == 'fit':
--> 159             self.collect()
    160
    161         if self.verbose:

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\layer.py in collect(self, path)
    169             transformer.collect(path)
    170         for learner in self.learners:
--> 171             learner.collect(path)
    172
    173     def set_output_columns(self, X, y, job, n_left_concats=0):

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\learner.py in collect(self, path)
    652          learner_data,
    653          sublearner_files,
--> 654          sublearner_data) = self._collect(path)
    655
    656         self.clear()

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\learner.py in _collect(self, path)
    709         if self.only_all:
    710             # Sub learners are the same as the sub-learners
--> 711             sublearner_files, sublearner_data = replace(learner_files)
    712
    713         return learner_files, learner_data, sublearner_files, sublearner_data

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\_base_functions.py in replace(source_files)
     67 def replace(source_files):
     68     """Utility function to replace empty files list"""
---> 69     replace_files = [deepcopy(o) for o in source_files]
     70     for o in replace_files:
     71         o.name = o.name[:-1] + '0'

[... <listcomp> frame and repeated deepcopy / _reconstruct / _deepcopy_dict / _deepcopy_list / _deepcopy_tuple recursion frames in copy.py elided ...]

D:\Continuum\anaconda3\lib\copy.py in deepcopy(x, memo, _nil)
    167     reductor = getattr(x, "__reduce_ex__", None)
    168     if reductor:
--> 169         rv = reductor(4)
    170     else:
    171         reductor = getattr(x, "__reduce__", None)

TypeError: can't pickle _thread.RLock objects
```
Ah, I missed a deepcopy instance used on the meta learner. I've fixed this in the PR branch; would you mind giving it another try?
And thanks, glad you like it! :)
Still getting an error message after making that change. See below:
```
Epoch 20/20
294/294 [==============================] - 0s 146us/step - loss: 0.5685

TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
     16 ensemble.add_meta(KerasClassifier(build_fn=create_model2, epochs=20, batch_size=5), proba=True)
     17
---> 18 ensemble.fit(X[:294], y[:294],)
     19
     20

[... the same mlens frames as in the previous traceback, from ensemble\base.py fit through parallel\learner.py _collect ...]

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\_base_functions.py in replace(source_files)
     67 def replace(source_files):
     68     """Utility function to replace empty files list"""
---> 69     # replace_files = [deepcopy(o) for o in source_files]
     70     replace_files = source_files.copy()
     71     for o in replace_files:

[... <listcomp> frame and repeated deepcopy / _reconstruct / _deepcopy_dict / _deepcopy_list / _deepcopy_tuple recursion frames in copy.py elided ...]

D:\Continuum\anaconda3\lib\copy.py in deepcopy(x, memo, _nil)
    167     reductor = getattr(x, "__reduce_ex__", None)
    168     if reductor:
--> 169         rv = reductor(4)
    170     else:
    171         reductor = getattr(x, "__reduce__", None)

TypeError: can't pickle _thread.RLock objects
```
It looks like you haven't successfully installed the PR #98 branch of mlens: it's the same line that's causing the error, but that line is fixed in the PR.
The key line in your traceback is 69 in the snippet below:

```
D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\_base_functions.py in replace(source_files)
     67 def replace(source_files):
     68     """Utility function to replace empty files list"""
---> 69     # replace_files = [deepcopy(o) for o in source_files]
     70     replace_files = source_files.copy()
     71     for o in replace_files:
```
That this raises at all is strange, because line 69 appears to be commented out and should never execute. But more to the point, in PR #98 this line does not even exist; in fact, the file the traceback is pointing to (mlens\parallel\_base_functions.py) doesn't exist in v2 at all.
It looks like you're not running the actual PR #98 branch but something else. If you have your own version, you can just remove the offending line. To install the PR, run:
```
pip uninstall mlens
git clone https://github.com/flennerhag/mlens
cd mlens
git fetch
git checkout deepcopy
pip install .
```
Great, thanks, that works!
That's likely because you are using multithreading / multiprocessing in the KerasClassifier (which was why you got the first error). Would you mind checking a few things:
- first, can you turn multithreading / multiprocessing off in the classifier?
- second, does it fail under both backend=threading and backend=multiprocessing?
- third, if it fails under backend=multiprocessing, could you try using different start methods: i.e. at the top of your script, after the imports, add mlens.config.set_start_method(method), where method is one of 'fork', 'spawn', 'forkserver'.
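As an aside, which start methods exist depends on the platform ('fork' is not available on Windows, which is why 'spawn' is the usual fallback there). A quick way to check what's available locally, using the standard library's multiprocessing module (not mlens-specific):

```python
import multiprocessing as mp

# List the start methods this platform supports; on Linux this is
# typically ['fork', 'spawn', 'forkserver'], on Windows only ['spawn'].
print(mp.get_all_start_methods())
```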
> That's likely because you are using multithreading / multiprocessing in the KerasClassifier (which was why you got the first error). Would you mind checking a few things:
> first, can you turn multithreading / processing off in the classifier?
> second, does it fail under both backend=threading and backend=multiprocessing?
> third, if it fails under backend=multiprocessing, could you try using different start methods: i.e. at the top of your script, after the imports, add mlens.config.set_start_method(method) where method is one of 'fork', 'spawn', 'forkserver'.
I had looked into whether Keras can be forced into single-threaded mode. I found some code to do that, but it did not work in my case.
keras-team/keras#4740
tensorflow/tensorflow#11066
This one goes into some of the issues with Keras and joblib:
keras-team/keras#3181
I had been using the TensorFlow backend. I was able to solve the problem by switching to the CNTK backend (I've not tried Theano).
To answer your other questions: prior to switching to the CNTK backend, I tried your multiprocessing suggestions.
In the first instance, multiprocessing just seemed to hang in the preprocessing phase. I tried 'fork' and 'forkserver' but that did not help. I checked the processes in my Task Manager and there were some additional processes in the background, but they seemed to be using very little memory or CPU.
I tried the 'spawn' option next. That did get me past the preprocessing phase, but it hung when it got to the KerasClassifier model.
I should mention that I was able to use KerasClassifier with the TensorFlow backend in one circumstance where n_jobs=-1: it only worked if the KerasClassifier was the only model in a layer. As a sole model in a layer it would run just fine, but once other models were in that layer it failed.
Anyway, I think the short-term solution for someone with this same issue is to try switching the Keras backend, since the TensorFlow backend creates issues.
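For anyone trying this workaround: with multi-backend Keras, the backend can be selected via the KERAS_BACKEND environment variable (or in ~/.keras/keras.json), provided it is set before keras is imported. A minimal sketch, assuming the CNTK backend is installed:

```python
import os

# Must be set before `import keras`; recognized values for multi-backend
# Keras include 'tensorflow', 'theano', and 'cntk'.
os.environ['KERAS_BACKEND'] = 'cntk'

# import keras  # keras would now initialize with the CNTK backend
```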
This is my script, by the way. It was just a simple example I was testing the package with, using the Pima Indians data, to get a feel for how it works.
```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import MinMaxScaler
import pandas as pd
import numpy as np
from mlens.ensemble import SuperLearner
from sklearn.linear_model import LogisticRegression
from keras.wrappers.scikit_learn import KerasClassifier
from keras.models import Sequential
from keras.layers import Dense
import mlens

mlens.config.set_start_method('spawn')

data = pd.read_csv('d:/PimaIndians.csv')
X = data.drop('test', axis=1).values
y = data['test'].values

seed = 2017
np.random.seed(seed)


def create_model(optimizer='adam', init='uniform'):
    # Create model
    model = Sequential()
    model.add(Dense(16, input_dim=4, kernel_initializer=init, activation='relu'))
    model.add(Dense(8, kernel_initializer=init, activation='relu'))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer)
    return model


ensemble = SuperLearner(random_state=seed, verbose=10, n_jobs=-1, backend='multiprocessing')

# Build the first layer
ensemble.add([GradientBoostingClassifier(random_state=seed), LogisticRegression()],
             preprocessing=MinMaxScaler(), proba=True)

# Build the second layer
ensemble.add([KerasClassifier(build_fn=create_model, epochs=5, batch_size=10),
              LogisticRegression(penalty='l1'), LogisticRegression(penalty='l2')],
             preprocessing=MinMaxScaler(), proba=True)

# Attach the final meta estimator
ensemble.add_meta([LogisticRegression(penalty='l1')], proba=True)

ensemble.fit(X[:294], y[:294])
```
By the way, I noticed that when I do include a neural network in the ensemble and have, for example, two models feeding into it from the preceding layer, the expected input dimension is 4. Do you know why that is?
E.g.
Layer 1: RandomForest, LogisticRegression
Layer 2: Neural network
I would have expected the input dimension for the neural network to be 2 rather than 4. I'm always using probabilities via the proba keyword. Is Layer 1 sending the predicted class AND the probability downstream, or is it sending both class probabilities for a given sample, e.g. 95% and 5%?
Is there a way to see the predictions of the individual layers?
That's super helpful, thanks! I think TF has its own process manager, and two chefs don't make a good soup. Weird that it works when it's the only model in the layer, though; that makes me think it's the other models that hang because of TF, and not the KerasClassifier.
As for your other question, the input dimension is n_models * n_classes. If you use subsembles, it is additionally multiplied by the number of partitions. It's a bit redundant to feed in all n_classes, but fixing that is not a priority.
And yes, you can see the predictions of a layer by specifying the return_preds argument in the fit or predict call. Set it to a list to get multiple layers at a time.
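The arithmetic behind the input dimension can be sketched as a tiny helper (the function name is hypothetical, not part of the mlens API):

```python
def meta_input_dim(n_models, n_classes, n_partitions=1):
    """Width of the prediction matrix a layer feeds downstream with proba=True.

    Each model emits one column per class; subsembles multiply this by the
    number of partitions.
    """
    return n_models * n_classes * n_partitions

# Two models on a binary problem -> 4 input columns, matching the
# input_dim=4 observed above.
print(meta_input_dim(2, 2))
```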
Gotcha, thanks!