Comments (13)
Set `state['rec_layer'] = 'RecurrentLayer'`.
The main parts of the code affected would be:
#### Word Embedding

```python
emb_words = MultiLayer(
    rng,
    n_in=state['n_in'],
    n_hids=eval(state['inp_nhids']),
    activation=eval(state['inp_activ']),
    init_fn='sample_weights_classic',
    weight_noise=state['weight_noise'],
    rank_n_approx=state['rank_n_approx'],
    scale=state['inp_scale'],
    sparsity=state['inp_sparse'],
    learn_bias=True,
    bias_scale=eval(state['inp_bias']),
    name='emb_words')
```
#### Recurrent Layer

```python
rec = eval(state['rec_layer'])(
    rng,
    eval(state['nhids']),
    activation=eval(state['rec_activ']),
    bias_scale=eval(state['rec_bias']),
    scale=eval(state['rec_scale']),
    sparsity=eval(state['rec_sparse']),
    init_fn=eval(state['rec_init']),
    weight_noise=state['weight_noise'],
    name='rec')
```
#### Stitching them together

##### (1) Get the embedding of a word

```python
x_emb = emb_words(x, no_noise_bias=state['no_noise_bias'])
```

##### (2) Embedding + Hidden State via Recurrent Layer

```python
reset = TT.scalar('reset')
rec_layer = rec(x_emb,
                no_noise_bias=state['no_noise_bias'],
                truncate_gradient=state['truncate_gradient'],
                batch_size=state['bs'])
```
#### Softmax Layer

```python
output_layer = SoftmaxLayer(
    rng,
    eval(state['nhids']),
    state['n_out'],
    scale=state['out_scale'],
    bias_scale=state['out_bias_scale'],
    init_fn="sample_weights_classic",
    weight_noise=state['weight_noise'],
    sparsity=state['out_sparse'],
    sum_over_time=True,
    name='out')
```
There might be something else you need to change, but you'll have to figure that out yourself.
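For orientation, here is a minimal, hypothetical slice of the `state` dictionary the snippets above read from. The keys match the calls above, but every value below is an illustrative assumption, not taken from GroundHog's actual prototype states. Note that several entries are strings containing Python expressions, which is why the layer constructors receive them through `eval()`:

```python
# Hypothetical `state` entries matching the snippets above; all values are
# illustrative assumptions, not GroundHog's real defaults.
state = {
    'rec_layer': 'RecurrentLayer',  # class name, instantiated via eval()
    'n_in': 10000,                  # input (vocabulary) size
    'inp_nhids': '[200]',           # hidden sizes kept as a string: consumed by eval()
    'nhids': '[200]',
    'weight_noise': False,
    'bs': 1,                        # batch size
}

# The string-valued entries become Python objects only at layer-construction time.
print(eval(state['nhids']))  # [200]
```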
Toms Bergmanis
On 16 October 2014 16:28, mkudinov [email protected] wrote:
I want to train a simple shallow RNN with 1 hidden layer (like in
Mikolov's work).
In this case I don't need an embedding layer and I would like to turn it
off.
What is the right way to do it? I tried to do this:
```python
rec_layer = rec(x, n_steps=x.shape[0],
                init_state=h0*reset,
                no_noise_bias=state['no_noise_bias'],
                truncate_gradient=state['truncate_gradient'],
                batch_size=1)
```
but it, obviously, didn't work.
I'm completely new to Theano, so I didn't succeed in guessing it just by looking through the code.
P.S. Obviously, this is not a bug report and shouldn't be here, but I don't have another way of communicating with you.
From Toms' example: you can ignore emb_words part, and feed 'x' directly to 'rec' as well. This will be closer to what Mikolov uses.
Would that make training simpler?
The size of the embedding matrix (in the latter case) is O(H x V). If your hidden state size H and vocabulary size V are large, the embedding matrix grows quite large. Instead, you can do a lower-rank approximation by having an embedding layer of size M, resulting in O(H x M + M x V) parameters.
This was our justification for this approach, but it's not necessary as long as you have enough data and a moderately sized hidden state.
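A quick back-of-the-envelope check of those parameter counts (the sizes below are hypothetical, not values from this thread):

```python
# Hypothetical sizes: hidden state H, vocabulary V, embedding dimension M.
H, V, M = 1000, 50000, 128

direct = H * V            # input fed straight into the hidden layer: O(H x V)
low_rank = H * M + M * V  # factored through an M-dimensional embedding

print(direct)    # 50000000
print(low_rank)  # 6528000
```

With these sizes the factored version uses roughly 13% of the parameters of the direct input-to-hidden matrix, which is the saving described above.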
@kyunghyuncho In this case I get the following error:
```
Traceback (most recent call last):
  File "mikolovStyle.py", line 416, in <module>
    jobman(state, None)
  File "mikolovStyle.py", line 134, in jobman
    name='rec')
  File "/home/mkudinov/workspace/GroundHog-master/groundhog/layers/rec_layers.py", line 974, in __init__
    self._init_params()
  File "/home/mkudinov/workspace/GroundHog-master/groundhog/layers/rec_layers.py", line 1007, in _init_params
    self.nG_hh = theano.shared(self.G_hh.get_value()*0, name='noise_'+self.G_hh.name)
AttributeError: 'RecurrentLayer' object has no attribute 'G_hh'
```
What is the update gate?
Sorry about the late reply (I'm travelling now.)
Can you replace the following lines
```python
self.nW_hh = theano.shared(self.W_hh.get_value()*0, name='noise_'+self.W_hh.name)
self.nG_hh = theano.shared(self.G_hh.get_value()*0, name='noise_'+self.G_hh.name)
self.noise_params = [self.nW_hh, self.nG_hh]
```
to
```python
self.nW_hh = theano.shared(self.W_hh.get_value()*0, name='noise_'+self.W_hh.name)
self.noise_params = [self.nW_hh]
if self.gating:
    self.nG_hh = theano.shared(self.G_hh.get_value()*0, name='noise_'+self.G_hh.name)
    self.noise_params += [self.nG_hh]
if self.reseting:
    self.nR_hh = theano.shared(self.R_hh.get_value()*0, name='noise_'+self.R_hh.name)
    self.noise_params += [self.nR_hh]
```
and try again?
If it works for you, I'll commit my changes.
I made the change and now I get
Original exception was:

```
Traceback (most recent call last):
  File "scripts/mikolovStyle.py", line 415, in <module>
    jobman(state, None)
  File "scripts/mikolovStyle.py", line 144, in jobman
    batch_size=state['bs'])
  File "/home/mkudinov/workspace/GroundHog-master/groundhog/layers/basic.py", line 464, in __call__
    new_obj.fprop(*args, **kwargs)
  File "/home/mkudinov/workspace/GroundHog-master/groundhog/layers/rec_layers.py", line 1170, in fprop
    n_steps = nsteps)
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan.py", line 1007, in scan
    scan_outs = local_op(*scan_inputs)
  File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 399, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 370, in make_node
    inner_sitsot_out.type.ndim))
ValueError: When compiling the inner function of scan the following error has been encountered: The initial state (outputs_info in scan nomenclature) of variable IncSubtensor{Set;:int64:}.0 (argument number 1) has dtype float32 and 2 dimension(s), while the result of the inner function for this output has dtype float64 and 1 dimension(s). This could happen if the inner graph of scan results in an upcast or downcast. Please make sure that you use dtypes consistently
```
Currently, GroundHog only supports single-precision floating point variables (float32). When you run your script, you should explicitly set floatX to 'float32' in the Theano configuration variables:

```
THEANO_FLAGS=device=gpu,floatX=float32 python your_script_name.py
```
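If you prefer a persistent setting over per-run flags, the same options can live in `~/.theanorc`, Theano's configuration file:

```ini
[global]
floatX = float32
device = gpu
```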
I set it through .theanorc:

```
print theano.config.floatX
float32
```
I didn't follow the whole discussion, but you probably forgot to do a cast somewhere in your code. You need to cast every variable whose type you change. For example, if you don't explicitly cast the result of multiplying a float32 variable with an int64 variable, the result will be float64.
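Theano isn't needed to see this promotion rule at work; NumPy applies a similar upcasting rule, so a minimal sketch of the drift described above is:

```python
import numpy as np

x = np.ones(3, dtype=np.float32)  # stands in for a float32 hidden state
w = np.ones(3, dtype=np.int64)    # stands in for an int64-typed input

# Multiplying float32 by int64 silently upcasts the result to float64:
# exactly the kind of dtype drift that breaks a float32-only pipeline.
y = x * w
print(y.dtype)  # float64

# An explicit cast keeps everything in single precision.
y32 = (x * w).astype(np.float32)
print(y32.dtype)  # float32
```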
Caglar GULCEHRE
I solved the problem. It was caused by an implicit conversion inside `RecurrentLayer.step_fprop()`: the dtype of the input vector was set to int64. Inside `RecurrentLayer.step_fprop()` there is the line

```python
preactiv = TT.dot(state_before_, W_hh) + state_below
```

Adding

```python
preactiv = TT.cast(preactiv, theano.config.floatX)
```

solves the problem.
But this means that in the recurrent layer the input is simply added to the previous hidden state. That is not the same as what Mikolov does, so the 1-layer embedding is required, i.e. tomsbergmanis was right, wasn't he?
You're right. I was somehow mistaken about this whole thing.
Can you make a pull request for the casting fix? Or I can make the change directly later.
I'll try.