I've implemented a recursive net and initialized the Sequencer with it. (also memory optim

Question about Sequencer.lua about opennmt HOT 4 CLOSED

opennmt commented on May 20, 2024

Question about Sequencer.lua

from opennmt.

Comments (4)

guillaumekln commented on May 20, 2024

The difference is that you call backward on the RVNN module because it is the one exposed by the Sequencer. As you don't override the backward function, the definition from nn.Module is used:

```lua
function Module:backward(input, gradOutput, scale)
   scale = scale or 1
   self:updateGradInput(input, gradOutput)
   self:accGradParameters(input, gradOutput, scale)
   return self.gradInput
end
```

which ignores the return value of updateGradInput and returns self.gradInput instead, so it expects self.gradInput to be set (not nil).

On the other hand, the LSTM module is not directly exposed by the Sequencer and it only relies on updateGradInput's return value. See https://github.com/torch/nngraph/blob/master/gmodule.lua#L420 which is called on each node in the graph.

However, these lines:

```lua
self.gradInput = self.net:updateGradInput(input, gradOutput)
return self.gradInput
```

should also appear in the LSTM module for consistency.
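
The two call paths described above can be imitated without Torch. The following is only a sketch in plain Lua: `Module`, `subclass`, `inner`, `Broken` and `Fixed` are hypothetical stand-ins, gradients are plain numbers instead of tensors, and the stand-in base class does not pre-allocate `gradInput` (an assumption about the wrapper in question). It shows that a wrapper which only returns the inner result works when the caller uses the return value of `updateGradInput`, but yields nil when called through the default `backward`.

```lua
-- Hypothetical stand-in for nn.Module; only the parts discussed here.
local Module = {}
Module.__index = Module

-- Same body as the nn.Module:backward definition quoted in this thread.
function Module:backward(input, gradOutput, scale)
  scale = scale or 1
  self:updateGradInput(input, gradOutput)
  self:accGradParameters(input, gradOutput, scale)
  return self.gradInput
end

-- No parameters in this sketch, so nothing to accumulate.
function Module:accGradParameters() end

-- Derive a wrapper "class" that stores an inner network in self.net.
local function subclass()
  local cls = setmetatable({}, {__index = Module})
  cls.__index = cls
  cls.new = function(net) return setmetatable({net = net}, cls) end
  return cls
end

-- Toy inner network: the gradient is just twice gradOutput.
local inner = {
  updateGradInput = function(self, input, gradOutput)
    return gradOutput * 2
  end
}

-- Wrapper that returns the gradient but never stores it.
local Broken = subclass()
function Broken:updateGradInput(input, gradOutput)
  return self.net:updateGradInput(input, gradOutput)
end

-- Wrapper that stores the gradient in self.gradInput before returning it.
local Fixed = subclass()
function Fixed:updateGradInput(input, gradOutput)
  self.gradInput = self.net:updateGradInput(input, gradOutput)
  return self.gradInput
end

-- Path 1: caller uses the return value (as a graph node would).
local viaReturn = Broken.new(inner):updateGradInput(1, 3)

-- Path 2: caller uses the default backward (as the Sequencer user does).
local brokenGrad = Broken.new(inner):backward(1, 3)
local fixedGrad = Fixed.new(inner):backward(1, 3)

print(viaReturn, brokenGrad, fixedGrad)
```

With these stand-ins, only `Fixed` gives the same result on both paths, which is why assigning `self.gradInput` inside `updateGradInput` is the safer convention.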

So thank you for your question.

guillaumekln commented on May 20, 2024

How do you initialize the Sequencer?

helson73 commented on May 20, 2024

@guillaumekln
I initialize the Sequencer like this:

```lua
local Tree, parent = torch.class('onmt.Tree', 'onmt.Sequencer')

-- rvnn: the recursive network that the Sequencer clones per step.
function Tree:__init(rvnn)
  self.rvnn = rvnn
  parent.__init(self, self.rvnn)
  self:resetPreallocation()
end

-- Rebuild a Tree from a serialized table (see Tree:serialize).
function Tree.load(pretrained)
  -- torch.factory returns a constructor function, so it must be called.
  local self = torch.factory('onmt.Tree')()
  self.rvnn = pretrained.modules[1]
  parent.__init(self, self.rvnn)
  self:resetPreallocation()
  return self
end

function Tree:training()
  parent.training(self)
end

function Tree:evaluate()
  parent.evaluate(self)
end

function Tree:serialize()
  return {
    modules = self.modules
  }
end

function Tree:maskPadding()
  self.maskPad = true
end

-- Prototype tensors reused across batches to avoid reallocation.
function Tree:resetPreallocation()
  self.headProto = torch.Tensor()
  self.depProto = torch.Tensor()
  self.gradFeedProto = torch.Tensor()
end

function Tree:forward(batch, f2s_)
  if self.train then
    self.inputs = {}
    self:_reset_noise()
  end

  local head_ = onmt.utils.Tensor.reuseTensor(self.headProto,
                                              {batch.size, self.rvnn.outSize})
  local dep_ = onmt.utils.Tensor.reuseTensor(self.depProto,
                                             {batch.size, self.rvnn.outSize})

  for t = 1, batch.headLength do
    -- Gather head and dependent states for this step.
    onmt.utils.DepTree._get(head_, f2s_, batch.head[t])
    onmt.utils.DepTree._get(dep_, f2s_, batch.dep[t])
    local tree_input = {head_, dep_, batch.relation[t]}
    if self.train then
      -- Keep the inputs for the backward pass.
      self.inputs[t] = tree_input
    end
    onmt.utils.DepTree._set(f2s_, self:net(t):forward(tree_input), batch.update[t])
  end
  return f2s_
end

function Tree:backward(batch, gradFeedOutput)
  local gradFeed_ = onmt.utils.Tensor.reuseTensor(self.gradFeedProto,
                                                  {batch.size, self.rvnn.outSize})
  -- Walk the steps in reverse order.
  for t = batch.headLength, 1, -1 do
    onmt.utils.DepTree._get(gradFeed_, gradFeedOutput, batch.update[t])
    -- Calls nn.Module:backward on the RVNN clone, so it needs self.gradInput set.
    local dtree = self:net(t):backward(self.inputs[t], gradFeed_)
    onmt.utils.DepTree._add(gradFeedOutput, dtree[1], batch.head[t])
    onmt.utils.DepTree._add(gradFeedOutput, dtree[2], batch.dep[t])
    onmt.utils.DepTree._fill(gradFeedOutput, 0, batch.update[t])
  end
  return gradFeedOutput
end
```

helson73 commented on May 20, 2024

@guillaumekln That's really helpful. Thanks!
