In the pre-training stage, the latent code z is sampled according to the prior p(z). I noticed

What is the role of the latent code? (issue on ase, status: open)

nv-tlabs commented on June 18, 2024

what is the role of latent code?

from ase.

Comments (7)

xjturobocon commented on June 18, 2024

So how does it get that structure? During pre-training, a latent code is sampled randomly at every step, so there is no guarantee that similar behaviors correspond to similar latent codes. Is that right?

xbpeng commented on June 18, 2024

Yes, the latents are sampled randomly during pre-training. Because of the objective used during pre-training, the model will learn to assign different behaviors to different latents automatically. This is similar to what happens in unsupervised reinforcement learning. We do not need to explicitly specify which skills a particular latent produces. Instead the GAN and unsupervised RL objective will automatically learn a skill embedding where different zs will be mapped to different behaviors that resemble the dataset. If you want more details, you can take a look at the paper for a more in-depth explanation.
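To make the sampling step concrete, here is a minimal sketch of how latents might be drawn and held during a pre-training rollout. It is an illustration, not the code from the ase repository: the function names, the fixed `resample_every` interval, and the choice of a uniform prior on the unit hypersphere are all assumptions made for the example. The GAN and unsupervised RL objectives themselves are not shown.

```python
import math
import random


def sample_latent(dim, rng):
    """Draw z from an assumed prior p(z): uniform on the unit hypersphere.

    Normalizing an isotropic Gaussian sample gives a uniformly
    distributed direction.
    """
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def rollout_latents(num_steps, dim, resample_every, seed=0):
    """Return the latent used at each step of a hypothetical rollout.

    A fresh z is drawn every `resample_every` steps and held fixed in
    between, so the policy has to produce a coherent behavior for
    whichever z it is currently conditioned on.
    """
    rng = random.Random(seed)
    latents = []
    z = None
    for t in range(num_steps):
        if t % resample_every == 0:
            z = sample_latent(dim, rng)  # independent of the previous z
        latents.append(z)
    return latents
```

Calling `rollout_latents(30, 8, 10)` yields 30 entries made of three independently sampled latents, each held for 10 steps. Nothing in the sampling ties a particular z to a particular skill; that assignment has to come from the training objective.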

xjturobocon commented on June 18, 2024

Thanks for your reply. Suppose the latent code is 1-dimensional (ranging from 0 to 1). For a specific skill, for example 'jump', the high-level stage should produce continuously changing zs (0.1, 0.11, 0.12, ...) over sequential frames when performing that skill, right? The input to the task policy network is continuous, so the zs it outputs are also continuous. However, I noticed that during pre-training, every 10 frames of the 'jump' skill, the amp_obs sequence is paired with a random z, so the same motion clip can be mapped to very different zs (0.1, 0.5, 0.9). I think this may make it hard to generate the 'jump' skill stably.

I would guess that an ideal mapping between zs and skills is one where each skill (motion clip) corresponds to a cluster of zs, not to scattered, unrelated zs.
I'd appreciate your opinion!

xbpeng commented on June 18, 2024

Sorry, I'm not sure I understand your question.

xjturobocon commented on June 18, 2024

Sorry, I didn't ask clearly. What confuses me is whether the latent space is structured after pre-training. Is it like the first picture or the second picture below? Each color represents a specific skill.

xbpeng commented on June 18, 2024

There is some structure in the latent space. Latents that are close in the latent space will typically correspond to similar behaviors.

xbpeng commented on June 18, 2024

Yes, that's right. The latents are sampled randomly during training, and there is no guarantee that similar behaviors correspond to similar latent codes. In practice, though, we do see that similar latent codes often lead to similar behaviors. This is likely due in part to the smoothness of the function approximator and to the mutual information objective.
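The smoothness point can be illustrated with a toy example. The sketch below is not the ASE policy; it is a small random two-layer tanh network, with hypothetical helper names, used only to show that a smooth function approximator cannot map nearby latents to arbitrarily different outputs. Because tanh is 1-Lipschitz, the change in output is bounded by the product of the weight-matrix norms times the distance between the two latents. The mutual information objective is not modeled here.

```python
import math
import random


def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Build a toy two-layer network with random Gaussian weights (no biases)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0.0, 1.0) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def forward(mlp, z):
    """Map a latent z to an output vector: W2 * tanh(W1 * z)."""
    w1, w2 = mlp
    hidden = [math.tanh(a) for a in matvec(w1, z)]
    return matvec(w2, hidden)


def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def frobenius(w):
    """Frobenius norm, an upper bound on the spectral norm of w."""
    return math.sqrt(sum(x * x for row in w for x in row))


def lipschitz_bound(mlp):
    """Upper bound on how much the output can change per unit change in z."""
    w1, w2 = mlp
    return frobenius(w1) * frobenius(w2)


mlp = make_mlp(in_dim=4, hidden_dim=16, out_dim=3)
z = [0.5, -0.5, 0.5, -0.5]
z_near = [zi + 0.01 for zi in z]

gap = dist(forward(mlp, z), forward(mlp, z_near))
# The output gap can never exceed the Lipschitz bound times the latent gap.
assert gap <= lipschitz_bound(mlp) * dist(z, z_near)
```

The bound is loose, but it captures the argument: randomly sampled latents that happen to land close together are pushed toward similar outputs by the network itself, even though the sampling gives no such guarantee.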
