
Save trained model (cnn_sentence issue, open)

yoonkim commented on September 26, 2024
Save trained model

from cnn_sentence.

Comments (24)

521088684 commented on September 26, 2024

Hi, I think I know how to save and load the model: save the params of the model, then load them back with the set_value() function.
Example
Save:
import cPickle

with open('./modelfile', 'wb') as f:
    cPickle.dump(classifier.params, f, -1)

Load:
with open('./model/classifier4class_all20160728.params', 'rb') as f:
    tmp = cPickle.load(f)
# copy the saved values into the freshly defined parameters, in place
for i in range(len(classifier.params)):
    classifier.params[i].set_value(tmp[i].get_value())

Of course, before loading the parameters you have to define the classifier, just as in the training process.
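
A minimal sketch of the same idea in Python 3, with one variation: it pickles the raw parameter values rather than the shared-variable objects. `FakeShared` is a hypothetical stand-in for a Theano shared variable (only get_value()/set_value() are mimicked), and `save_params`/`load_params` are illustrative names, not functions from cnn_sentence.

```python
import copy
import os
import pickle
import tempfile


class FakeShared(object):
    """Hypothetical stand-in for a Theano shared variable (get_value/set_value only)."""

    def __init__(self, value):
        self._value = copy.deepcopy(value)

    def get_value(self):
        return copy.deepcopy(self._value)

    def set_value(self, value):
        self._value = copy.deepcopy(value)


def save_params(params, path):
    """Pickle the parameter values (plain data), not the variable objects."""
    values = [p.get_value() for p in params]
    with open(path, "wb") as f:
        pickle.dump(values, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_params(params, path):
    """Copy saved values into the already-constructed parameters, in place."""
    with open(path, "rb") as f:
        values = pickle.load(f)
    if len(values) != len(params):
        raise ValueError("expected %d values, found %d" % (len(params), len(values)))
    for param, value in zip(params, values):
        param.set_value(value)


# "Trained" model: two parameters with known values.
trained = [FakeShared([[1.0, 2.0], [3.0, 4.0]]), FakeShared([0.5, -0.5])]
path = os.path.join(tempfile.mkdtemp(), "model.params")
save_params(trained, path)

# Freshly defined model with the same architecture but untrained values.
fresh = [FakeShared([[0.0, 0.0], [0.0, 0.0]]), FakeShared([0.0, 0.0])]
load_params(fresh, path)
restored_ok = all(a.get_value() == b.get_value() for a, b in zip(trained, fresh))
print(restored_ok)
```

The length check is there because a file saved from a differently shaped model would otherwise be loaded partially without any warning.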

chaseleecn commented on September 26, 2024

Same question. Did you find a way?

hadyelsahar commented on September 26, 2024

Once you are happy with your trained network, you can use Python's pickle module to serialize the object to a file with the dump method and restore it with the load method.
Python pickle: https://docs.python.org/2/library/pickle.html

chaseleecn commented on September 26, 2024

In this code, which object should I serialize? I tried to use dump to serialize 'classifier', but it didn't work:
PicklingError: Can't pickle <type 'instancemethod'>

MarkWuNLP commented on September 26, 2024

@guanxingke you should serialize the object named "params", which stores the parameters of the model. To classify a new document, reconstruct the neural network and set its parameters to the values you saved.

Huarong commented on September 26, 2024

@guanxingke you can use what @MarkWuNLP has mentioned above. Alternatively, I refactored the code into a class with two methods:

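def save(self, path):
    # Pickle the whole model object with the highest available protocol.
    with open(path, 'wb') as f:
        pickle.dump(self, f, -1)
    logger.info('save model to path %s' % path)

@classmethod
def load(cls, path):
    # Return the model object previously written by save().
    with open(path, 'rb') as f:
        return pickle.load(f)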

chaseleecn commented on September 26, 2024

@MarkWuNLP Thanks for your answer. I still have some questions.
There are two params: ① params in the function train_conv_net, ② classifier.params.
I assume that you mean the first one. When should I serialize the params, and when should I load them?

This is my code:
params = classifier.params
for conv_layer in conv_layers:
    params += conv_layer.params
if non_static:
    #if word vectors are allowed to change, add them as model parameters
    params += [Words]

# f = open("params.save", "wb")
# cPickle.dump(params, f)
# f.close()
# print 'Params saved.'

print "Loading params..."
fr = open("params.save","rb")
params = cPickle.load(fr)
fr.close()

I save the params first and load them the next time. It doesn't work:

Loading params...
Traceback (most recent call last):
File "D:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py", line 2358, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "D:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py", line 1778, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "F:/PythonWorkspace/CNN_sentence-master/conv_net_sentence.py", line 339, in <module>
dropout_rate=[0.5])
File "F:/PythonWorkspace/CNN_sentence-master/conv_net_sentence.py", line 115, in train_conv_net
grad_updates = sgd_updates_adadelta(params, dropout_cost, lr_decay, 1e-6, sqr_norm_lim)
File "F:/PythonWorkspace/CNN_sentence-master/conv_net_sentence.py", line 233, in sgd_updates_adadelta
gp = T.grad(cost, param)
File "C:\Python27\lib\site-packages\theano-0.7.0-py2.7.egg\theano\gradient.py", line 529, in grad
handle_disconnected(elem)
File "C:\Python27\lib\site-packages\theano-0.7.0-py2.7.egg\theano\gradient.py", line 516, in handle_disconnected
raise DisconnectedInputError(message)
theano.gradient.DisconnectedInputError: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: W

If I load the params before the "while (epoch < n_epochs):" loop, I don't see how the params could affect my result.
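
One likely reading of this traceback: `params = cPickle.load(fr)` rebinds the name `params` to brand-new objects, while the cost expression was built from the original ones, so the gradient is requested with respect to variables the graph has never seen. The sketch below illustrates only the object-identity part in plain Python. `Param`, `build_cost` and `is_connected` are hypothetical stand-ins, and a deep copy plays the role of the dump/load round trip.

```python
import copy


class Param(object):
    """Hypothetical stand-in for a shared variable holding one value."""

    def __init__(self, value):
        self.value = value


def build_cost(params):
    """Stand-in for graph construction: the 'cost' remembers the exact objects it uses."""
    return {"inputs": list(params)}


def is_connected(cost, param):
    """True if this exact object (by identity, not equality) is part of the cost."""
    return any(param is used for used in cost["inputs"])


params = [Param(1.0), Param(2.0)]
cost = build_cost(params)

# What the failing code does: rebind the name to freshly created objects.
reloaded = copy.deepcopy(params)  # plays the role of cPickle.load(...)
print(is_connected(cost, reloaded[0]))  # False

# What works: keep the original objects and copy only the values into them.
for original, loaded in zip(params, reloaded):
    original.value = loaded.value
print(is_connected(cost, params[0]))  # True
```

In the real code, the in-place copy would be the set_value(...) loop suggested earlier in this thread.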

chaseleecn commented on September 26, 2024

@Huarong Thanks for your answer.
I assume the class you mean is "class MLPDropout".
I save the class like this:

classifier.save('classifier.save')

It's the same error I mentioned before:

Traceback (most recent call last):
File "D:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py", line 2358, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "D:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py", line 1778, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "F:/PythonWorkspace/CNN_sentence-master/conv_net_sentence.py", line 342, in <module>
dropout_rate=[0.5])
File "F:/PythonWorkspace/CNN_sentence-master/conv_net_sentence.py", line 170, in train_conv_net
classifier.save('classifier.save')
File "conv_net_classes.py", line 183, in save
pickle.dump(self, f, -1)
File "C:\Python27\lib\pickle.py", line 1370, in dump
Pickler(file, protocol).dump(obj)
File "C:\Python27\lib\pickle.py", line 224, in dump
self.save(obj)
File "C:\Python27\lib\pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "C:\Python27\lib\pickle.py", line 419, in save_reduce
save(state)
File "C:\Python27\lib\pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "C:\Python27\lib\pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
File "C:\Python27\lib\pickle.py", line 681, in _batch_setitems
save(v)
File "C:\Python27\lib\pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
File "C:\Python27\lib\pickle.py", line 396, in save_reduce
save(cls)
File "C:\Python27\lib\pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
File "C:\Python27\lib\pickle.py", line 748, in save_global
(obj, module, name))
pickle.PicklingError: Can't pickle <type 'instancemethod'>: it's not found as __builtin__.instancemethod
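
The traceback passes through save_dict before failing, which suggests that one attribute stored on the classifier, rather than the classifier as a whole, is what pickle rejects. A small diagnostic along these lines can help locate it. This is a sketch in Python 3, where the error messages differ from the Python 2 output above; `Toy` and `find_unpicklable` are hypothetical names, and the lambda merely plays the role of whatever unpicklable member the real class holds.

```python
import pickle


def find_unpicklable(obj):
    """Return the names of instance attributes that pickle refuses to serialize."""
    bad = []
    for name, value in sorted(vars(obj).items()):
        try:
            pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
        except Exception:  # the exact exception type depends on the value
            bad.append(name)
    return bad


class Toy(object):
    """Hypothetical model: plain data plus a function-valued attribute."""

    def __init__(self):
        self.weights = [[0.1, 0.2], [0.3, 0.4]]
        self.bias = [0.0, 0.0]
        self.activation = lambda x: max(0.0, x)  # lambdas cannot be pickled


print(find_unpicklable(Toy()))
```

Once the offending attribute is known, saving only the parameter values, as suggested elsewhere in this thread, sidesteps the problem.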

MarkWuNLP commented on September 26, 2024

@guanxingke No, I mean both the CNN parameters and the MLP parameters. I used cPickle to store the params successfully.

savefile = file('obj.save', 'wb')
cPickle.dump(params, savefile, protocol=cPickle.HIGHEST_PROTOCOL)
savefile.close()

Furthermore, when you use the stored parameters, conv_layer.W_conv and conv_layer.b_conv have to be reset, as well as self.dropout_layers.W and self.dropout_layers.b.
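
Since several layers have to be restored, it can help to store the values under explicit names rather than relying on list order. A minimal sketch, assuming each layer exposes its values as plain lists; the layer names and the `collect_state`/`restore_state` helpers are hypothetical and not part of cnn_sentence.

```python
import copy
import os
import pickle
import tempfile


def collect_state(layers):
    """Map 'layer_name.param_name' to a copy of each parameter value."""
    state = {}
    for layer_name, layer_params in layers.items():
        for param_name, value in layer_params.items():
            state["%s.%s" % (layer_name, param_name)] = copy.deepcopy(value)
    return state


def restore_state(layers, state):
    """Write saved values back into the matching slots; fail loudly on a mismatch."""
    expected = set(collect_state(layers))
    if expected != set(state):
        raise KeyError("parameter names differ: %s" % sorted(expected ^ set(state)))
    for key, value in state.items():
        layer_name, param_name = key.split(".", 1)
        layers[layer_name][param_name] = copy.deepcopy(value)


# Hypothetical model: one convolution layer and one output layer.
trained = {
    "conv0": {"W": [[0.1, 0.2]], "b": [0.3]},
    "output": {"W": [[0.4], [0.5]], "b": [0.6]},
}
path = os.path.join(tempfile.mkdtemp(), "obj.save")
with open(path, "wb") as f:
    pickle.dump(collect_state(trained), f, protocol=pickle.HIGHEST_PROTOCOL)

# Rebuild the same architecture with untrained values, then restore.
rebuilt = {
    "conv0": {"W": [[0.0, 0.0]], "b": [0.0]},
    "output": {"W": [[0.0], [0.0]], "b": [0.0]},
}
with open(path, "rb") as f:
    restore_state(rebuilt, pickle.load(f))
print(rebuilt == trained)
```

With named entries, a forgotten layer shows up as a KeyError at load time instead of as silently untrained weights.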

chaseleecn commented on September 26, 2024

@MarkWuNLP I stored the params successfully too, but how do I use them to predict on text I write myself?
Could you show an example, e.g. when and how to load the params and make further predictions?
Thanks again.

huydan commented on September 26, 2024

@chasonlee I have the same problem as you. Did you find an answer to your last question? Any example?
Thanks

Muugii-bs commented on September 26, 2024

@huydan , @chasonlee
Did you guys find any example?

deepanwayx commented on September 26, 2024

@chasonlee , @huydan , @Muugii-bs
Hey guys, I have made some modifications to the code so that predictions can be made on test examples. You can find it here:
https://github.com/DeepanwayGhosal/CNN_sentence
Let me know if this works.

Muugii-bs commented on September 26, 2024

Thank you @DeepanwayGhosal 👍
I have a question. Where in the code do you load the saved parameters?

deepanwayx commented on September 26, 2024

@Muugii-bs I don't actually save the parameters. Along with the train and validation sets, I also pass the test set to the function train_conv_net(), and it returns the predicted test labels.

In the train_conv_net() function you can find a code snippet between these two comment lines that predicts the test labels:

# So we make prediction only by taking a maximum of 2000 test examples at a time
......
#start training over mini-batches

@huydan mentioned in his last comment that he wants a prediction example, so that is all I have done here. I will try to make a version that saves and loads the parameters and can make predictions without passing the test set to the train_conv_net() function.

Monireh2 commented on September 26, 2024

@deepanwayx @521088684 can you please explain in more detail how the code should change if we want to save the trained model and load it at test time to predict the sentiment of a single instance?

Monireh2 commented on September 26, 2024

@chasonlee did you find any example of how to save the model and then predict from the saved model? Can you please share it?

Monireh2 commented on September 26, 2024

@521088684 Can you please specify where exactly I should dump and where I should load the model? I dump it at the end of this if block:
if val_perf >= best_val_perf:

and load it after

classifier = MLPDropout(rng, input=layer1_input, layer_sizes=hidden_units, activations=activations, dropout_rates=dropout_rate)

#define parameters of the model and update functions using adadelta
params = classifier.params
for conv_layer in conv_layers:
    params += conv_layer.params
if non_static:
    #if word vectors are allowed to change, add them as model parameters
    params += [Words]

and of course I removed the training part and tested it on my test set.

But the problem is that when I apply it to my test data, most instances are assigned to the negative class, which is not consistent with the accuracy I got.
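
One detail in the snippet above that is easy to miss, and that may or may not be related to the skewed predictions: in Python, `params = classifier.params` followed by `params += ...` extends the classifier's own list in place, so what `classifier.params` contains depends on whether it is read before or after those lines. A minimal illustration with plain lists (the `Classifier` class and the strings are stand-ins):

```python
class Classifier(object):
    """Stand-in with a params list, as in MLPDropout."""

    def __init__(self):
        self.params = ["mlp_W", "mlp_b"]


classifier = Classifier()
count_before = len(classifier.params)

params = classifier.params          # same list object, not a copy
params += ["conv_W", "conv_b"]      # in-place extend: classifier.params grows too
count_after = len(classifier.params)

print(count_before, count_after)    # 2 4
print(params is classifier.params)  # True
```

If the file was written at one of these points and read back at the other, a loop over `range(len(classifier.params))` would cover a different number of entries than were saved, which is worth ruling out.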

Denybarros commented on September 26, 2024

Has anyone already successfully implemented saving and loading the model? How can I save the trained model and load it at test time to predict the sentiment of a single instance?

abishekh commented on September 26, 2024

Same request. If someone has worked this out successfully, it would be good to get some insight.

pexmar commented on September 26, 2024

Hi,

for me the suggestion from @521088684 works.
What I actually did was add the following method to the MLPDropout class:

def save(self, path):
    with open(path, 'wb') as f:
        pickle.dump(self.params, f, -1)

Then, in the train_conv_net() method, you can insert classifier.save("/.../") anywhere after the initialization (ideally after checking whether the current model is the best one so far).

Then I added a new method to the conv_net_sentence.py file, which is a copy of the train_conv_net() method with a few changes. The first part of this method looks like this:

rng = np.random.RandomState(3435)
img_h = len(datasets[0])-1
filter_w = img_w
feature_maps = hidden_units[0]
filter_shapes = []
pool_sizes = []
for filter_h in filter_hs:
    filter_shapes.append((feature_maps, 1, filter_h, filter_w))
    pool_sizes.append((img_h-filter_h+1, img_w-filter_w+1))

'''
define model architecture
'''
index = T.lscalar()
x = T.matrix('x')
y = T.ivector('y')
Words = theano.shared(value=U, name="Words")
zero_vec_tensor = T.vector()
zero_vec = np.zeros(img_w)
set_zero = theano.function([zero_vec_tensor], updates=[(Words, T.set_subtensor(Words[0, :], zero_vec_tensor))],
                           allow_input_downcast=True)
layer0_input = Words[T.cast(x.flatten(), dtype="int32")].reshape((x.shape[0], 1, x.shape[1], Words.shape[1]))
conv_layers = []
layer1_inputs = []
for i in xrange(len(filter_hs)):
    filter_shape = filter_shapes[i]
    pool_size = pool_sizes[i]
    conv_layer = LeNetConvPoolLayer(rng, input=layer0_input, image_shape=(batch_size, 1, img_h, img_w),
                                    filter_shape=filter_shape, poolsize=pool_size, non_linear=conv_non_linear)
    layer1_input = conv_layer.output.flatten(2)
    conv_layers.append(conv_layer)
    layer1_inputs.append(layer1_input)
layer1_input = T.concatenate(layer1_inputs, 1)
hidden_units[0] = feature_maps*len(filter_hs)
classifier = MLPDropout(rng, input=layer1_input, layer_sizes=hidden_units, activations=activations,
                        dropout_rates=dropout_rate)

'''
define parameters of the model and update functions using adadelta
'''
params = classifier.params
for conv_layer in conv_layers:
    params += conv_layer.params
if non_static:
    # if word vectors are allowed to change, add them as model parameters
    params += [Words]
cost = classifier.negative_log_likelihood(y)
dropout_cost = classifier.dropout_negative_log_likelihood(y)
grad_updates = sgd_updates_adadelta(params, dropout_cost, lr_decay, 1e-6, sqr_norm_lim)

'''
load model parameters
'''
with open(params_file, 'rb') as f:
    tmp = cPickle.load(f)

for i in range(len(classifier.params)):
    classifier.params[i].set_value(tmp[i].get_value())

del tmp

'''
prepare datasets
2) split x and y axis
'''
np.random.seed(3435)

test_set_x = datasets[:, :img_h]
test_set_y = np.asarray(datasets[:, -1], "int32")

'''
models
'''
test_pred_layers = []
test_size = batch_size              # modified line

test_layer0_input = Words[T.cast(x.flatten(), dtype="int32")].reshape((test_size, 1, img_h, Words.shape[1]))
for conv_layer in conv_layers:
    test_layer0_output = conv_layer.predict(test_layer0_input, test_size)
    test_pred_layers.append(test_layer0_output.flatten(2))

test_layer1_input = T.concatenate(test_pred_layers, 1)
test_y_pred = classifier.predict(test_layer1_input)
test_y_pred_p = classifier.predict_p(test_layer1_input)
test_y_pred_p_reduce = test_y_pred_p[:, 0]
test_error = T.mean(T.neq(test_y_pred, y))
test_model_all = theano.function([x, y], test_error, allow_input_downcast=True)
test_predict = theano.function([x], test_y_pred, allow_input_downcast=True)
test_probs = theano.function([x], test_y_pred_p_reduce, allow_input_downcast=True)


Afterwards you only have to split the data into batches and call the test_ functions :-)
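
A rough sketch of that last step, assuming the compiled functions accept exactly `batch_size` rows at a time (the reshape above fixes the batch dimension to `test_size = batch_size`). `predict_in_batches` is a hypothetical helper, and `fake_predict` stands in for a compiled function such as `test_predict`; the final batch is padded by repeating its last row, and the predictions for the padded rows are discarded.

```python
def predict_in_batches(predict_fn, rows, batch_size):
    """Call predict_fn on full batches only and return one prediction per input row."""
    predictions = []
    for start in range(0, len(rows), batch_size):
        batch = list(rows[start:start + batch_size])
        n_real = len(batch)
        # Pad a short final batch so the fixed batch dimension is respected.
        batch.extend([batch[-1]] * (batch_size - n_real))
        predictions.extend(list(predict_fn(batch))[:n_real])
    return predictions


def fake_predict(batch):
    """Stand-in for test_predict: insists on full batches, labels a row by its parity."""
    assert len(batch) == 4, "compiled function expects exactly batch_size rows"
    return [sum(row) % 2 for row in batch]


# Ten rows of word indices, batch size 4: batches of 4, 4 and 2 (padded to 4).
rows = [[i, 0, 0] for i in range(10)]
labels = predict_in_batches(fake_predict, rows, batch_size=4)
print(labels)
```

With the real functions, `rows` would be a slice of `test_set_x`, and each batch would be converted to a NumPy array before the call.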

Zero0one1 commented on September 26, 2024

@pexmar could you give examples of how to use these test_ functions? I have no idea what they do. Thank you so much.

moses9591 commented on September 26, 2024

@Zero0one1 Did you succeed in using these functions?

Zero0one1 commented on September 26, 2024

> @Zero0one1 Did you succeed in using these functions?

No. I saved the model successfully but don't know how to predict new input. Do you have any idea?
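
For new text, the missing piece is usually turning a sentence into the same kind of index row the model was trained on. A minimal sketch, assuming index 0 is the padding word (the set_zero function in the code above zeroes row 0 of Words), that unknown words are simply skipped, and that rows have a fixed width. `sentence_to_indices`, `word_idx_map`, `max_len` and `pad` are hypothetical names here and have to match whatever the training preprocessing used.

```python
def sentence_to_indices(sentence, word_idx_map, max_len, pad):
    """Convert a sentence into a fixed-width row of word indices, zero-padded."""
    indices = [0] * pad                           # leading padding
    for word in sentence.lower().split():
        if word in word_idx_map:                  # out-of-vocabulary words are dropped
            indices.append(word_idx_map[word])
    width = max_len + 2 * pad
    indices = indices[:width]                     # truncate overly long sentences
    indices.extend([0] * (width - len(indices)))  # trailing padding
    return indices


# Hypothetical vocabulary; in practice this comes from the training data.
word_idx_map = {"a": 1, "great": 2, "movie": 3, "boring": 4}
row = sentence_to_indices("A great movie indeed", word_idx_map, max_len=5, pad=2)
print(row)  # [0, 0, 1, 2, 3, 0, 0, 0, 0]
```

A row like this, stacked into a batch of the expected size, is the kind of input a function such as test_predict from the comment above would take.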
