Comments (24)
Hi, I think I know how to save and load the model. You can save the model's params and load them back with the set_value() function.
Example
save:
import cPickle
with open('./modelfile', 'wb') as f:
    cPickle.dump(classifier.params, f, -1)
load:
with open('./model/classifier4class_all20160728.params', 'rb') as f:
    tmp = cPickle.load(f)
for i in range(len(classifier.params)):
    classifier.params[i].set_value(tmp[i].get_value())
Of course, before loading the params you have to define the classifier, just as in the training process.
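A variation on the save/load pattern above is to pickle only the raw values rather than the shared-variable objects. The sketch below is a minimal illustration under assumptions: `SharedStub` is a hypothetical stand-in that mimics only `get_value()`/`set_value()` (it is not Theano's class), plain lists stand in for weight arrays, and `save_param_values`/`load_param_values` are names invented for this example.

```python
import os
import pickle
import tempfile


class SharedStub(object):
    """Hypothetical stand-in for a shared variable (get_value/set_value only)."""

    def __init__(self, value):
        self._value = list(value)

    def get_value(self):
        return list(self._value)

    def set_value(self, value):
        self._value = list(value)


def save_param_values(params, path):
    # Store the plain values, not the variable objects themselves.
    values = [p.get_value() for p in params]
    with open(path, 'wb') as f:
        pickle.dump(values, f, pickle.HIGHEST_PROTOCOL)


def load_param_values(params, path):
    with open(path, 'rb') as f:
        values = pickle.load(f)
    # Refuse to load if the rebuilt model has a different number of params.
    if len(values) != len(params):
        raise ValueError('expected %d values, found %d' % (len(params), len(values)))
    for p, v in zip(params, values):
        p.set_value(v)


trained = [SharedStub([1.0, 2.0, 3.0]), SharedStub([0.5])]
fresh = [SharedStub([0.0, 0.0, 0.0]), SharedStub([0.0])]  # model rebuilt before loading
path = os.path.join(tempfile.mkdtemp(), 'params.pkl')
save_param_values(trained, path)
load_param_values(fresh, path)
```

The length check is a design choice for this sketch: it fails early when the rebuilt model does not have the same number of parameters as the saved one.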
from cnn_sentence.
Same question. Did you find a way?
from cnn_sentence.
Once you are happy with your trained network, you can use the Python pickle module to serialize the object to a file with the dump method, and load it back with the load method.
Python pickle: https://docs.python.org/2/library/pickle.html
from cnn_sentence.
In this code, which object should I serialize? I tried to use the dump method to serialize 'classifier', but it didn't work:
PicklingError: Can't pickle <type 'instancemethod'>
from cnn_sentence.
@guanxingke You should serialize the object named "params", which stores the parameters of the model. When you want to classify a new document, you should reconstruct the neural network and set its parameters to the values you saved.
from cnn_sentence.
@guanxingke You can use what @MarkWuNLP has mentioned above. I instead refactored the code into a class with two methods.
import logging
import pickle

logger = logging.getLogger(__name__)

def save(self, path):
    with open(path, 'wb') as f:
        pickle.dump(self, f, -1)
    logger.info('save model to path %s' % path)
    return None

@classmethod
def load(cls, path):
    with open(path, 'rb') as f:
        return pickle.load(f)
from cnn_sentence.
@MarkWuNLP Thanks for your answer. I still have some questions.
There are two params: ① params in the function train_conv_net, ② classifier.params.
I assume that you mean the first one. When should I serialize the params? And when should I load them?
This is my code:
params = classifier.params
for conv_layer in conv_layers:
    params += conv_layer.params
if non_static:
    #if word vectors are allowed to change, add them as model parameters
    params += [Words]
# f = open("params.save", "wb")
# cPickle.dump(params, f)
# f.close()
# print 'Params saved.'
print "Loading params..."
fr = open("params.save","rb")
params = cPickle.load(fr)
fr.close()
I save the params first and load them on the next run. It doesn't work:
Loading params...
Traceback (most recent call last):
File "D:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py", line 2358, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "D:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py", line 1778, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "F:/PythonWorkspace/CNN_sentence-master/conv_net_sentence.py", line 339, in <module>
dropout_rate=[0.5])
File "F:/PythonWorkspace/CNN_sentence-master/conv_net_sentence.py", line 115, in train_conv_net
grad_updates = sgd_updates_adadelta(params, dropout_cost, lr_decay, 1e-6, sqr_norm_lim)
File "F:/PythonWorkspace/CNN_sentence-master/conv_net_sentence.py", line 233, in sgd_updates_adadelta
gp = T.grad(cost, param)
File "C:\Python27\lib\site-packages\theano-0.7.0-py2.7.egg\theano\gradient.py", line 529, in grad
handle_disconnected(elem)
File "C:\Python27\lib\site-packages\theano-0.7.0-py2.7.egg\theano\gradient.py", line 516, in handle_disconnected
raise DisconnectedInputError(message)
theano.gradient.DisconnectedInputError: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: W
If I load the params before the "while (epoch < n_epochs):" loop, I don't see how the params could affect my result.
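A likely explanation for the DisconnectedInputError above is that `params = cPickle.load(fr)` rebinds the name `params` to freshly unpickled variables, while the cost expression was built from the original ones. The sketch below illustrates the difference between rebinding and copying values in place. It assumes a hypothetical `SharedStub` class in place of Theano shared variables, and a plain function `cost()` in place of the compiled graph.

```python
class SharedStub(object):
    """Hypothetical stand-in for a shared variable; not Theano's class."""

    def __init__(self, value):
        self._value = list(value)

    def get_value(self):
        return list(self._value)

    def set_value(self, value):
        self._value = list(value)


W = SharedStub([0.0, 0.0])
params = [W]


def cost():
    # Stands in for the graph, which keeps a reference to W itself.
    return sum(W.get_value())


# What unpickling returns: new objects that the "graph" has never seen.
loaded = [SharedStub([1.0, 2.0])]

# Rebinding the list: cost() still reads the old, untouched W.
params = loaded
cost_after_rebind = cost()

# Copying values into the existing variable: cost() now sees them.
params = [W]
for p, new in zip(params, loaded):
    p.set_value(new.get_value())
cost_after_copy = cost()
```

In this toy setup only the second approach changes what `cost()` computes, which is why the set_value() loop suggested earlier in the thread is the safer pattern.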
from cnn_sentence.
@Huarong Thanks for your answer.
I assume the class you mean is MLPDropout.
I save the class like this:
classifier.save('classifier.save')
It's the same error I mentioned before:
Traceback (most recent call last):
File "D:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py", line 2358, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "D:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py", line 1778, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "F:/PythonWorkspace/CNN_sentence-master/conv_net_sentence.py", line 342, in <module>
dropout_rate=[0.5])
File "F:/PythonWorkspace/CNN_sentence-master/conv_net_sentence.py", line 170, in train_conv_net
classifier.save('classifier.save')
File "conv_net_classes.py", line 183, in save
pickle.dump(self, f, -1)
File "C:\Python27\lib\pickle.py", line 1370, in dump
Pickler(file, protocol).dump(obj)
File "C:\Python27\lib\pickle.py", line 224, in dump
self.save(obj)
File "C:\Python27\lib\pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "C:\Python27\lib\pickle.py", line 419, in save_reduce
save(state)
File "C:\Python27\lib\pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "C:\Python27\lib\pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "C:\Python27\lib\pickle.py", line 681, in _batch_setitems
save(v)
File "C:\Python27\lib\pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "C:\Python27\lib\pickle.py", line 396, in save_reduce
save(cls)
File "C:\Python27\lib\pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "C:\Python27\lib\pickle.py", line 748, in save_global
(obj, module, name))
pickle.PicklingError: Can't pickle <type 'instancemethod'>: it's not found as __builtin__.instancemethod
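The traceback above shows pickle failing on one attribute inside the object's state dict. To find out which attribute is responsible, one option is to try pickling each attribute on its own. This is only a diagnostic sketch: `picklable_state` and `Model` are hypothetical names, and a lambda is used here as an example of something pickle rejects (in the thread the offending value is an instancemethod).

```python
import pickle


def picklable_state(obj):
    """Split obj's attributes into those pickle accepts and the names it rejects."""
    kept, skipped = {}, []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value, pickle.HIGHEST_PROTOCOL)
        except Exception:
            skipped.append(name)
        else:
            kept[name] = value
    return kept, skipped


class Model(object):
    """Hypothetical model with one picklable and one unpicklable attribute."""

    def __init__(self):
        self.weights = [0.5, -1.25]
        self.activation = lambda x: max(0.0, x)  # pickle cannot serialize this


state, skipped = picklable_state(Model())
blob = pickle.dumps(state, pickle.HIGHEST_PROTOCOL)
```

Here `skipped` names the attribute that would have to be dropped before saving and rebuilt after loading; saving only the parameter values, as suggested elsewhere in this thread, sidesteps the problem entirely.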
from cnn_sentence.
@guanxingke No, I mean both the CNN parameters and the MLP parameters. I used cPickle to store the params successfully.
with open('obj.save', 'wb') as savefile:
    cPickle.dump(params, savefile, protocol=cPickle.HIGHEST_PROTOCOL)
Furthermore, when you use the stored parameters, conv_layer.W_conv and conv_layer.b_conv should be reset, as well as self.dropout_layers.W and self.dropout_layers.b.
from cnn_sentence.
@MarkWuNLP I stored the params successfully too, but how do I use them to predict on text I write myself?
Could you show some examples? For instance, when and how do I load the params and then make further predictions?
Thanks again.
from cnn_sentence.
@chasonlee I have the same problem as you. Did you find an answer to your last question ? Any example ?
Thanks
from cnn_sentence.
@huydan , @chasonlee
Did you guys find any example ?_?
from cnn_sentence.
@chasonlee , @huydan , @Muugii-bs
Hey guys, I have made some modifications to the code so that further predictions can be made on some test examples. You can find it here:
https://github.com/DeepanwayGhosal/CNN_sentence
Let me know if this works.
from cnn_sentence.
Thank you @DeepanwayGhosal 👍
I have a question. Where in the code do you load the saved parameters?
from cnn_sentence.
@Muugii-bs I don't actually save the parameters. Along with train and validation set I also pass the test set in the function train_conv_net() and it returns predicted test labels.
In the train_conv_net() function you can find a code snippet between these two comment lines which predicts the test labels.
# So we make prediction only by taking a maximum of 2000 test examples at a time
......
#start training over mini-batches
@huydan mentioned in his last comment that he wants a prediction example, so that is all I have done here. I will try to make a version that saves and loads the parameters and can make predictions without passing the test set to the train_conv_net() function.
from cnn_sentence.
@deepanwayx @521088684 Can you please explain in more detail how the code should change if we want to save the trained model and load it at test time to predict the sentiment of a single instance?
from cnn_sentence.
@chasonlee did you find any example on how you should save and then predict based on your saved model? Can you please share those examples?
from cnn_sentence.
@521088684 Can you please specify where exactly I should dump and where I should load the model? I have dumped it at the end of this if:
if val_perf >= best_val_perf:
and loaded it after
classifier = MLPDropout(rng, input=layer1_input, layer_sizes=hidden_units, activations=activations, dropout_rates=dropout_rate)
#define parameters of the model and update functions using adadelta
params = classifier.params
for conv_layer in conv_layers:
    params += conv_layer.params
if non_static:
    #if word vectors are allowed to change, add them as model parameters
    params += [Words]
and of course removed the training part and tested it on my test set.
But the problem is that when I apply it to my test data, most examples are assigned to the negative class, which does not match the accuracy I got during training.
from cnn_sentence.
Has anyone already successfully implemented saving and loading the model? How do I save the trained model and load it at test time to predict the sentiment of a single instance?
from cnn_sentence.
Same request. If someone has worked this out successfully, it would be good to get some insight.
from cnn_sentence.
Hi,
For me the suggestion from @521088684 works.
What I actually did was add the following method to the MLPDropout class:
`
def save(self, path):
    with open(path, 'wb') as f:
        pickle.dump(self.params, f, -1)
    return None
`
Then, in the train_conv_net() method, you can insert `classifier.save("/.../")` anywhere after the initialization (ideally right after checking whether the current model is the best one so far).
Then I added a new method to the conv_net_sentence.py file, essentially a copy of the train_conv_net() method with a few changes. The first part of this method looks like this:
`
rng = np.random.RandomState(3435)
img_h = len(datasets[0])-1
filter_w = img_w
feature_maps = hidden_units[0]
filter_shapes = []
pool_sizes = []
for filter_h in filter_hs:
    filter_shapes.append((feature_maps, 1, filter_h, filter_w))
    pool_sizes.append((img_h-filter_h+1, img_w-filter_w+1))
'''
define model architecture
'''
index = T.lscalar()
x = T.matrix('x')
y = T.ivector('y')
Words = theano.shared(value=U, name="Words")
zero_vec_tensor = T.vector()
zero_vec = np.zeros(img_w)
set_zero = theano.function([zero_vec_tensor], updates=[(Words, T.set_subtensor(Words[0, :], zero_vec_tensor))],
                           allow_input_downcast=True)
layer0_input = Words[T.cast(x.flatten(), dtype="int32")].reshape((x.shape[0], 1, x.shape[1], Words.shape[1]))
conv_layers = []
layer1_inputs = []
for i in xrange(len(filter_hs)):
    filter_shape = filter_shapes[i]
    pool_size = pool_sizes[i]
    conv_layer = LeNetConvPoolLayer(rng, input=layer0_input, image_shape=(batch_size, 1, img_h, img_w),
                                    filter_shape=filter_shape, poolsize=pool_size, non_linear=conv_non_linear)
    layer1_input = conv_layer.output.flatten(2)
    conv_layers.append(conv_layer)
    layer1_inputs.append(layer1_input)
layer1_input = T.concatenate(layer1_inputs, 1)
hidden_units[0] = feature_maps*len(filter_hs)
classifier = MLPDropout(rng, input=layer1_input, layer_sizes=hidden_units, activations=activations,
                        dropout_rates=dropout_rate)
'''
define parameters of the model and update functions using adadelta
'''
params = classifier.params
for conv_layer in conv_layers:
    params += conv_layer.params
if non_static:
    # if word vectors are allowed to change, add them as model parameters
    params += [Words]
cost = classifier.negative_log_likelihood(y)
dropout_cost = classifier.dropout_negative_log_likelihood(y)
grad_updates = sgd_updates_adadelta(params, dropout_cost, lr_decay, 1e-6, sqr_norm_lim)
'''
load model parameters
'''
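with open(params_file, 'rb') as f:
    tmp = cPickle.load(f)
# classifier.params is the same list object as params, so the "+=" lines above
# also appended the conv-layer params (and Words, if non_static) to it
for i in range(len(classifier.params)):
    classifier.params[i].set_value(tmp[i].get_value())
del tmp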
'''
prepare datasets
2) split x and y axis
'''
np.random.seed(3435)
test_set_x = datasets[:, :img_h]
test_set_y = np.asarray(datasets[:, -1], "int32")
'''
models
'''
test_pred_layers = []
test_size = batch_size # modified line
test_layer0_input = Words[T.cast(x.flatten(), dtype="int32")].reshape((test_size, 1, img_h, Words.shape[1]))
for conv_layer in conv_layers:
    test_layer0_output = conv_layer.predict(test_layer0_input, test_size)
    test_pred_layers.append(test_layer0_output.flatten(2))
test_layer1_input = T.concatenate(test_pred_layers, 1)
test_y_pred = classifier.predict(test_layer1_input)
test_y_pred_p = classifier.predict_p(test_layer1_input)
test_y_pred_p_reduce = test_y_pred_p[:, 0]
test_error = T.mean(T.neq(test_y_pred, y))
test_model_all = theano.function([x, y], test_error, allow_input_downcast=True)
test_predict = theano.function([x], test_y_pred, allow_input_downcast=True)
test_probs = theano.function([x], test_y_pred_p_reduce, allow_input_downcast=True)
`
Afterwards you only have to split the data into batches and call the test_ functions :-)
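The batching step mentioned above could look roughly like the sketch below. Since the code above builds the graph for exactly batch_size rows, the last (or only) batch is padded by repeating its final row, and the padded predictions are dropped afterwards. `predict_in_batches` is a name invented for this sketch, `fake_predict` is a hypothetical stand-in for a compiled function such as test_predict, and plain lists stand in for the arrays the real functions would take.

```python
def predict_in_batches(predict_fn, rows, batch_size):
    """Call predict_fn on fixed-size batches, padding the last one as needed."""
    predictions = []
    for start in range(0, len(rows), batch_size):
        batch = list(rows[start:start + batch_size])
        n_real = len(batch)
        # Pad a short batch by repeating its last row so the size is always batch_size.
        batch.extend([batch[-1]] * (batch_size - n_real))
        # Keep only the predictions that belong to real rows.
        predictions.extend(list(predict_fn(batch))[:n_real])
    return predictions


def fake_predict(batch):
    # Hypothetical stand-in for test_predict: label is 1 if the first word index is odd.
    assert len(batch) == 4
    return [row[0] % 2 for row in batch]


rows = [[i, 0, 0] for i in range(10)]
labels = predict_in_batches(fake_predict, rows, batch_size=4)

# A single instance works the same way: it is padded up to one full batch.
single = predict_in_batches(fake_predict, [[7, 0, 0]], batch_size=4)
```

With the real model, each row would be a sentence already converted to word indices and padded to length img_h, exactly as during training.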
from cnn_sentence.
@pexmar Could you give examples of how to use these test_ functions? I have no idea what they do. Thank you so much.
from cnn_sentence.
@Zero0one1 Did you manage to use these functions?
from cnn_sentence.
@Zero0one1 Did you manage to use these functions?
No. I saved the model successfully but don't know how to predict on new input. Do you have any idea?
from cnn_sentence.