Comments (2)
Hi! To clarify: are you looking for the tensor sequences of intermediate layers (for example, after the encoder), or for an easier way to access the input and output tensors?
Intermediate layers and outputs are something that fairseq as a whole doesn't currently expose easily — you'd need to modify the transformer methods themselves to return intermediate layers you're interested in, and then you could return them through the hub interface. This is not something I plan to add in the near-term.
The `hub_interface.py` code for visrep follows other fairseq interfaces — for example, check out the default fairseq `HubInterface`. The `encode` function tokenizes the input and converts it into the tensor that is passed to the model (which in this case means taking a string and generating a tensor corresponding to its rendered image), and `decode` detokenizes the model's output tensor into a sentence string. The encoder and decoder aren't accessed separately, and the encoded tensors from `VisualTextDataset` are the same as those returned by `encode()`.
If you want the tensors for the model inputs and outputs, you can get the input tensor sequence by calling `encode()` directly, and the final output by modifying L62 in `hub_interface.py` to return not just the output string:

return [self.decode(hypos[0]["tokens"]) for hypos in batched_hypos]
--> return [(self.decode(hypos[0]["tokens"]), hypos[0]) for hypos in batched_hypos]
where `hypos` is the list of beam hypotheses, best first, so `hypos[0]` is the output hypothesis chosen by beam search. The full list will look something like this:
[{'tokens': tensor([ 57, 8, 7, 11, 749, 5, 2]), 'score': tensor(-0.4421), 'attention': tensor([]), 'alignment': tensor([]), 'positional_scores': tensor([-1.6331, -0.2708, -0.1837, -0.4788, -0.0630, -0.2784, -0.1871])},
 {'tokens': tensor([152, 8, 7, 11, 749, 5, 2]), 'score': tensor(-0.4499), 'attention': tensor([]), 'alignment': tensor([]), 'positional_scores': tensor([-1.6986, -0.3032, -0.1806, -0.4487, -0.0603, -0.2711, -0.1871])},
 {'tokens': tensor([100, 16, 11, 749, 5, 2]), 'score': tensor(-0.4618), 'attention': tensor([]), 'alignment': tensor([]), 'positional_scores': tensor([-1.6489, -0.2012, -0.4071, -0.0563, -0.2692, -0.1882])},
 {'tokens': tensor([ 14, 22, 8, 7, 11, 749, 5, 2]), 'score': tensor(-0.5500), 'attention': tensor([]), 'alignment': tensor([]), 'positional_scores': tensor([-2.5914, -0.3630, -0.2739, -0.1726, -0.4729, -0.0613, -0.2734, -0.1919])},
 {'tokens': tensor([ 20, 13, 8, 7, 11, 749, 5, 2]), 'score': tensor(-0.6997), 'attention': tensor([]), 'alignment': tensor([]), 'positional_scores': tensor([-3.1400, -0.9755, -0.3081, -0.1903, -0.4878, -0.0541, -0.2568, -0.1851])}]
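A small aside on reading these: with fairseq's default length normalization, each hypothesis's `score` is the per-token mean of its `positional_scores`, which you can check against the first hypothesis above (plain floats here instead of torch tensors):

```python
# Verify that 'score' is the mean of 'positional_scores' for the
# best hypothesis in the example output above.
positional_scores = [-1.6331, -0.2708, -0.1837, -0.4788, -0.0630, -0.2784, -0.1871]
score = sum(positional_scores) / len(positional_scores)
print(round(score, 4))  # -> -0.4421, matching the reported 'score'
```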
from visrep.
> are you looking for the tensor sequences of intermediate layers (for example, after the encoder), or for an easier way to access the input and output tensors?
I'm interested in the intermediate layers and outputs of the model. I can use `VisualTextTransformerEncoder` like this:

self.models[0].encoder(batch['net_input']['src_tokens'], batch['net_input']['src_lengths'])

The `TransformerDecoder` is more complicated, since it requires `prev_output_tokens`, which are created in `SequenceGenerator`.
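For reference, during training (teacher forcing) fairseq builds `prev_output_tokens` from the target by moving the EOS token to the beginning, i.e. a right-shift of the sequence; a minimal sketch of that convention, using plain lists and the token ids from the example output above:

```python
# Sketch of fairseq's "move EOS to the beginning" convention used to
# build prev_output_tokens from a target sequence (teacher forcing).
EOS = 2  # fairseq's default EOS index

def make_prev_output_tokens(target):
    # target ends with EOS; rotate it so the decoder is fed EOS first
    assert target[-1] == EOS
    return [EOS] + target[:-1]

print(make_prev_output_tokens([57, 8, 7, 11, 749, 5, 2]))
# -> [2, 57, 8, 7, 11, 749, 5]
```

At inference time `SequenceGenerator` builds this incrementally instead, appending each newly generated token, which is why calling the decoder directly is more awkward than calling the encoder.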
Anyway, thanks for your answer.