
Comments (12)

onacrame avatar onacrame commented on May 14, 2024 1

Just one thing to note. It doesn't really impact me, since I'm using a small dataset, but when I use the KerasClassifier in the ensemble I can only get it to work with n_jobs=1. If I set it to -1, it fails.

from mlens.

flennerhag avatar flennerhag commented on May 14, 2024

Hi!

Looks like Keras models can't be deep-copied. I took a look at the code, and the use of deepcopy there is overly protective; it's a legacy of a previous version. I've pushed a branch in PR #98 that should solve this. It would be great if you could give it a try and confirm that it fixes the issue!
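The root cause is easy to reproduce without Keras at all: `copy.deepcopy` fails on any object whose state contains a lock, which is exactly what a compiled Keras/TensorFlow model carries internally. A minimal stdlib stand-in (the class name here is illustrative, not anything from Keras or mlens):

```python
import copy
import threading

class FakeKerasModel:
    """Stand-in for a compiled Keras model: its state holds an RLock
    (as TF session internals do), which cannot be pickled or deep-copied."""
    def __init__(self):
        self.graph_lock = threading.RLock()
        self.weights = [0.1, 0.2]

model = FakeKerasModel()
try:
    copy.deepcopy(model)
except TypeError as exc:
    # Same failure mode as the tracebacks in this thread; the exact
    # wording ("can't pickle _thread.RLock objects", "cannot pickle
    # '_thread.RLock' object") varies by Python version.
    print(type(exc).__name__)  # → TypeError
```

This is why any code path that deep-copies the fitted estimator, rather than re-instantiating it from its constructor parameters, breaks as soon as a Keras wrapper is in the ensemble.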


onacrame avatar onacrame commented on May 14, 2024

Hi and thanks for the prompt feedback. Fantastic package and looking forward to seeing it develop!

OK, so that did seem to work; however, if I use KerasClassifier as the meta estimator, I get the same error during the fit stage:

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
     16 ensemble.add_meta(KerasClassifier(build_fn=create_model2, epochs=20, batch_size=5), proba=True)
     17
---> 18 ensemble.fit(X[:294], y[:294],)
     19
     20

D:\Continuum\anaconda3\lib\site-packages\mlens\ensemble\base.py in fit(self, X, y, **kwargs)
    514         self._id_train.fit(X)
    515
--> 516         out = self._backend.fit(X, y, **kwargs)
    517         if out is not self._backend:
    518             # fit_transform

D:\Continuum\anaconda3\lib\site-packages\mlens\ensemble\base.py in fit(self, X, y, **kwargs)
    156         with ParallelProcessing(self.backend, self.n_jobs,
    157                                 max(self.verbose - 4, 0)) as manager:
--> 158             out = manager.stack(self, 'fit', X, y, **kwargs)
    159
    160         if self.verbose:

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\backend.py in stack(self, caller, job, X, y, path, return_preds, warm_start, split, **kwargs)
    671             job=job, X=X, y=y, path=path, warm_start=warm_start,
    672             return_preds=return_preds, split=split, stack=True)
--> 673         return self.process(caller=caller, out=out, **kwargs)
    674
    675     def process(self, caller, out, **kwargs):

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\backend.py in process(self, caller, out, **kwargs)
    716             self.job.clear()
    717
--> 718             self._partial_process(task, parallel, **kwargs)
    719
    720             if task.name in return_names:

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\backend.py in _partial_process(self, task, parallel, **kwargs)
    737             self._gen_prediction_array(task, self.job.job, self.threading)
    738
--> 739         task(self.job.args(**kwargs), parallel=parallel)
    740
    741         if not task.no_output and getattr(task, 'n_feature_prop', 0):

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\layer.py in __call__(self, args, parallel)
    157
    158         if job == 'fit':
--> 159             self.collect()
    160
    161         if self.verbose:

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\layer.py in collect(self, path)
    169             transformer.collect(path)
    170         for learner in self.learners:
--> 171             learner.collect(path)
    172
    173     def set_output_columns(self, X, y, job, n_left_concats=0):

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\learner.py in collect(self, path)
    652          learner_data,
    653          sublearner_files,
--> 654          sublearner_data) = self._collect(path)
    655
    656         self.clear()

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\learner.py in _collect(self, path)
    709         if self.only_all:
    710             # Sub learners are the same as the sub-learners
--> 711             sublearner_files, sublearner_data = replace(learner_files)
    712
    713         return learner_files, learner_data, sublearner_files, sublearner_data

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\_base_functions.py in replace(source_files)
     67 def replace(source_files):
     68     """Utility function to replace empty files list"""
---> 69     replace_files = [deepcopy(o) for o in source_files]
     70     for o in replace_files:
     71         o.name = o.name[:-1] + '0'

[... recursive deepcopy / _reconstruct / _deepcopy_dict frames in
    D:\Continuum\anaconda3\lib\copy.py elided ...]

D:\Continuum\anaconda3\lib\copy.py in deepcopy(x, memo, _nil)
    167     reductor = getattr(x, "__reduce_ex__", None)
    168     if reductor:
--> 169         rv = reductor(4)
    170     else:
    171         reductor = getattr(x, "__reduce__", None)

TypeError: can't pickle _thread.RLock objects
```


flennerhag avatar flennerhag commented on May 14, 2024

Oh, I missed a deepcopy call applied to the meta learner. I've fixed this in the PR branch; would you mind giving it another try?

And thanks, glad you like it! :)


onacrame avatar onacrame commented on May 14, 2024

Still getting an error message after making that change; see below.

```
Epoch 20/20
294/294 [==============================] - 0s 146us/step - loss: 0.5685

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
     16 ensemble.add_meta(KerasClassifier(build_fn=create_model2, epochs=20, batch_size=5), proba=True)
     17
---> 18 ensemble.fit(X[:294], y[:294],)
     19
     20

D:\Continuum\anaconda3\lib\site-packages\mlens\ensemble\base.py in fit(self, X, y, **kwargs)
    514         self._id_train.fit(X)
    515
--> 516         out = self._backend.fit(X, y, **kwargs)
    517         if out is not self._backend:
    518             # fit_transform

[... same mlens frames as in the previous traceback: base.py ->
    backend.py (stack, process, _partial_process) ->
    layer.py (__call__, collect) -> learner.py (collect, _collect) ...]

D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\_base_functions.py in replace(source_files)
     67 def replace(source_files):
     68     """Utility function to replace empty files list"""
---> 69     # replace_files = [deepcopy(o) for o in source_files]
     70     replace_files = source_files.copy()
     71     for o in replace_files:

[... recursive deepcopy / _reconstruct / _deepcopy_dict frames in
    D:\Continuum\anaconda3\lib\copy.py elided ...]

D:\Continuum\anaconda3\lib\copy.py in deepcopy(x, memo, _nil)
    167     reductor = getattr(x, "__reduce_ex__", None)
    168     if reductor:
--> 169         rv = reductor(4)
    170     else:
    171         reductor = getattr(x, "__reduce__", None)

TypeError: can't pickle _thread.RLock objects
```


flennerhag avatar flennerhag commented on May 14, 2024

It looks like you haven't successfully installed the PR #98 branch of mlens: the error comes from the same line as before, and that line is fixed in the PR.

The key line in your traceback is 69 in the snippet below:

```
D:\Continuum\anaconda3\lib\site-packages\mlens\parallel\_base_functions.py in replace(source_files)
     67 def replace(source_files):
     68     """Utility function to replace empty files list"""
---> 69     # replace_files = [deepcopy(o) for o in source_files]
     70     replace_files = source_files.copy()
     71     for o in replace_files:
```

This is strange, because line 69 appears to be commented out and should not be executed at all. More to the point, in PR #98 this line does not even exist; in fact, the file the traceback is pointing to (mlens\parallel\_base_functions.py) doesn't exist in v2 at all.

It looks like you're not running the actual PR #98 branch, but something else. If you have your own version, you can simply remove the offending line. To install the PR, run:

```
pip uninstall mlens
git clone https://github.com/flennerhag/mlens; cd mlens
git fetch
git checkout deepcopy
pip install .
```


onacrame avatar onacrame commented on May 14, 2024

Great thanks that works!


flennerhag avatar flennerhag commented on May 14, 2024

That's likely because you are using multithreading / multiprocessing in the KerasClassifier (which was why you got the first error). Would you mind checking a few things:

  • first, can you turn multithreading / processing off in the classifier?

  • second, does it fail under both backend=threading and backend=multiprocessing?

  • third, if it fails under backend=multiprocessing, could you try using different start methods: i.e. at the top of your script, after the imports, add mlens.config.set_start_method(method) where method is one of 'fork', 'spawn', 'forkserver'.
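As a quick sanity check before trying the third point (plain stdlib, independent of mlens), you can list which start methods your platform actually supports:

```python
import multiprocessing as mp

# 'spawn' is available on every platform and is the only option on
# Windows; 'fork' and 'forkserver' are POSIX-only. mlens's
# mlens.config.set_start_method accepts the same names. 'spawn' starts
# a fresh interpreter per worker, which sidesteps state (such as locks)
# that cannot survive a fork.
methods = mp.get_all_start_methods()
print(methods)
```

On Windows this prints `['spawn']`, which is why 'fork' and 'forkserver' are not worth trying there.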


onacrame avatar onacrame commented on May 14, 2024

> That's likely because you are using multithreading / multiprocessing in the KerasClassifier (which was why you got the first error). Would you mind checking a few things:
>
> first, can you turn multithreading / processing off in the classifier?
>
> second, does it fail under both backend=threading and backend=multiprocessing?
>
> third, if it fails under backend=multiprocessing, could you try using different start methods: i.e. at the top of your script, after the imports, add mlens.config.set_start_method(method) where method is one of 'fork', 'spawn', 'forkserver'.

I had looked into whether Keras can be forced into single-threaded mode. I found some code to do that, but it did not work in my case.

keras-team/keras#4740
tensorflow/tensorflow#11066

This one goes into some issues with Keras and joblib
keras-team/keras#3181

I had been using the TensorFlow backend. I was able to solve the problem by switching to the CNTK backend (I've not tried Theano).

To answer your other questions: prior to switching to the CNTK backend, I tried your multiprocessing suggestions.

In the first instance, multiprocessing just seemed to hang in the preprocessing phase. I tried 'fork' and 'forkserver', but that did not help. I checked my Task Manager and there were some additional processes in the background, but they seemed to be using very little memory or CPU.

I tried the 'spawn' option next. That did get me past the preprocessing phase, but it hung when it got to the KerasClassifier model.

I should mention that I was able to use KerasClassifier with the TensorFlow backend and n_jobs=-1 in one circumstance: when it was the only model in a layer. A KerasClassifier as a sole layer would run just fine, but once other models were in that layer it failed.

Anyway, I think the short-term solution for anyone with this same issue is to switch the Keras backend, since the TensorFlow backend creates these problems.
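For anyone hitting the same wall: switching the Keras backend doesn't require code changes; it's standard Keras configuration (the script name below is a placeholder):

```shell
# One-off: select the backend for a single run via environment variable
KERAS_BACKEND=cntk python ensemble_script.py

# Or persistently: edit ~/.keras/keras.json and set
#   "backend": "cntk"
```

Keras prints the active backend ("Using CNTK backend", etc.) on import, so it's easy to confirm which one is in play.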

This is my script by the way. This was just a simple example I was testing this package with to get a feel for how it works using the Pima Indians data.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import MinMaxScaler
import pandas as pd
import numpy as np
from mlens.ensemble import SuperLearner
from sklearn.linear_model import LogisticRegression
from keras.wrappers.scikit_learn import KerasClassifier
from keras.models import Sequential
from keras.layers import Dense
import mlens

mlens.config.set_start_method('spawn')

data = pd.read_csv('d:/PimaIndians.csv')

X = data.drop('test', axis=1).values
y = data['test'].values

seed = 2017
np.random.seed(seed)

def create_model(optimizer='adam', init='uniform'):
    # create model
    model = Sequential()
    model.add(Dense(16, input_dim=4, kernel_initializer=init, activation='relu'))
    model.add(Dense(8, kernel_initializer=init, activation='relu'))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer)
    return model

ensemble = SuperLearner(random_state=seed, verbose=10, n_jobs=-1, backend='multiprocessing')

# Build the first layer
ensemble.add([GradientBoostingClassifier(random_state=seed), LogisticRegression()],
             preprocessing=MinMaxScaler(), proba=True)

# Build the second layer
ensemble.add([KerasClassifier(build_fn=create_model, epochs=5, batch_size=10),
              LogisticRegression(penalty='l1'), LogisticRegression(penalty='l2')],
             preprocessing=MinMaxScaler(), proba=True)

# Attach the final meta estimator
ensemble.add_meta([LogisticRegression(penalty='l1')], proba=True)

ensemble.fit(X[:294], y[:294])
```

By the way, I noticed that when I do include a neural network in the ensemble, with, for example, two models feeding into it from the preceding layer, its expected input dimension is 4. Do you know why that is?

E.g.
Layer1: RandomForest, Logistic Regression
Layer2: Neural Network

I would have expected the input dimension for the neural network to be 2 rather than 4. I'm always using probabilities via the proba keyword. Is Layer 1 sending both the predicted class and the probability downstream, or is it sending both class probabilities for a given sample (e.g. 95% and 5%)?

Is there a way to see the predictions of the individual layers?


flennerhag avatar flennerhag commented on May 14, 2024

That’s super helpful, thanks! I think TF has its own process manager, and two chefs don’t make a good soup. It’s weird that it works when it’s the only model in the layer, though; that makes me think it’s the other models that hang because of TF, and not the KerasClassifier itself.


flennerhag avatar flennerhag commented on May 14, 2024

And as for your other question, the input dim is n_models * n_classes. If you use subsembles, it is additionally multiplied by the number of partitions. It’s a bit redundant to feed in all n_classes, but fixing it is not a priority.
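The width rule above can be written down as a one-liner (an illustrative helper, not part of the mlens API):

```python
def layer_output_width(n_models: int, n_classes: int, n_partitions: int = 1) -> int:
    """Columns a layer emits with proba=True: one column per class per
    model, multiplied by the number of partitions when using subsembles."""
    return n_models * n_classes * n_partitions

# The example from this thread: two models on a binary problem feed
# the next layer's network 2 * 2 = 4 input features.
print(layer_output_width(2, 2))  # → 4
```

So the network sees both class probabilities from each upstream model, which is why the input dimension is 4 rather than 2.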

And yes, you can see the predictions of a layer by specifying the return_preds argument in the fit or predict call. Set it to a list to get multiple layers at a time.


onacrame avatar onacrame commented on May 14, 2024

Gotcha, thanks!

