
Comments (7)

PaParaZz1 commented on August 22, 2024

random_collect_size specifies how many samples (n_sample) to collect, so it is not compatible with the episode collector. Why do you need to collect data by episode? If necessary, we can add an n_episode option to this function.

from di-engine.

tianhan4 commented on August 22, 2024

I do not intend to use random collection; however, random_collect_size is hardcoded in 'serial_entry'. So I think I either need to write my own entry function or delete the default random_collect_size parameter from the policies.


PaParaZz1 commented on August 22, 2024

OK, you can set random_collect_size=0 in your config to disable random collection.

BTW, we set random_collect_size=1000 in the SACPolicy default config so that the early phase of off-policy training is more stable, for example in some MuJoCo environments. You can adjust it for your own environment.
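As a rough illustration of the suggestion above, here is a minimal sketch of a config fragment with random collection disabled. The nesting under `policy`, the `collect.n_episode` key, and the helper `wants_random_collect` are assumptions made for this example; they are not copied from the DI-engine source, and a real config contains many more fields.

```python
# Hypothetical, trimmed-down config fragment (plain dicts, no DI-engine import).
sac_config = dict(
    policy=dict(
        random_collect_size=0,       # 0 turns off the initial random-collection phase
        collect=dict(n_episode=8),   # assumed key for episode-based collection; 8 is illustrative
    ),
)


def wants_random_collect(cfg: dict) -> bool:
    """Hypothetical helper: the kind of check an entry function could make."""
    return cfg["policy"].get("random_collect_size", 0) > 0


print(wants_random_collect(sac_config))
```

With the value set to 0 (or the key missing), a check of this kind skips the random-collection phase entirely, so the sample-based setting never conflicts with an episode collector.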


PaParaZz1 commented on August 22, 2024

This comparison experiment shows the importance of random_collect_size in the MuJoCo HalfCheetah environment:
random_collect_size=10000 vs. random_collect_size=0
[Image: screenshot of the comparison, "Screen Shot 2022-01-16 at 9 30 25 PM"]


tianhan4 commented on August 22, 2024

OK, thank you! I use n_episode because my environment is a real-time environment (congestion control) and cannot be paused between steps. So I personally think it would be useful to interpret random_collect_size as a number of episodes when n_episode is used.


tianhan4 commented on August 22, 2024

FYI: for my own needs, I implemented a new noise generator that mixes uniform random noise and Gaussian noise during the first steps. As a result, it no longer matters whether I use n_episode or n_sample.
I also wrote a new base noise class, because I want to pass the action as an argument in order to generate action-dependent noise, for example a purely random output.


from abc import ABC, abstractmethod

import numpy as np
import torch


class BaseNoise2(ABC):
    """Noise base class whose __call__ also receives the current action."""

    def __init__(self) -> None:
        super().__init__()

    @abstractmethod
    def __call__(self, action: torch.Tensor, shape: tuple, device: str) -> torch.Tensor:
        raise NotImplementedError


class HybridNoise(BaseNoise2):
    """Uniform random output early on, Gaussian noise with a decaying sigma otherwise."""

    def __init__(self, mu: float = 0.0, sigma: float = 1.0,
                 noise_exp: int = 10000, random_exp: int = 10000, noise_end: float = 0.05) -> None:
        super().__init__()
        assert sigma >= 0, "HybridNoise's sigma should be non-negative."
        assert noise_exp > 0 and random_exp > 0, "HybridNoise's noise_exp and random_exp should be positive."
        self._mu = mu
        self._sigma = sigma
        self._noise_exp = noise_exp
        self._random_exp = random_exp
        self._noise_end = noise_end
        self._noise_interval = sigma - noise_end
        self._random_step = random_exp
        assert self._noise_interval >= 0, "HybridNoise's sigma should be larger than the end sigma."

    def __call__(self, action: torch.Tensor, shape: tuple, device: str) -> torch.Tensor:
        # The chance of taking the random branch decays linearly to 0 over random_exp calls.
        if self._random_step > 0:
            self._random_step -= 1
        if np.random.uniform(0, 1) < self._random_step / float(self._random_exp):
            # Assuming the caller adds the returned noise to the action, subtracting
            # the action here makes the final output purely uniform in [-1, 1).
            random_noise = torch.rand(shape, device=device) * 2 - 1
            return random_noise - action
        # Anneal sigma linearly towards noise_end; this only happens on Gaussian calls.
        if self._sigma > self._noise_end:
            self._sigma = max(self._noise_end, self._sigma - self._noise_interval / self._noise_exp)
        return torch.randn(shape, device=device) * self._sigma + self._mu
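To make the two schedules in HybridNoise easier to see, here is a small standalone sketch in plain Python (no torch). The function names `random_prob` and `sigma_after` are hypothetical and exist only for this illustration; they restate the arithmetic of the class above under its default arguments and are not part of it.

```python
def random_prob(call_index: int, random_exp: int = 10000) -> float:
    """Probability that the n-th call (1-indexed) returns the uniform random output."""
    remaining = max(0, random_exp - call_index)
    return remaining / float(random_exp)


def sigma_after(gaussian_calls: int, sigma: float = 1.0,
                noise_end: float = 0.05, noise_exp: int = 10000) -> float:
    """Approximate sigma after a given number of Gaussian-branch calls."""
    step = (sigma - noise_end) / noise_exp
    return max(noise_end, sigma - gaussian_calls * step)


for n in (1, 5000, 10000):
    print(n, random_prob(n), sigma_after(n))
```

One consequence worth noting: sigma is only reduced on calls that take the Gaussian branch, so while the random branch dominates, sigma decays more slowly than one step per call.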


PaParaZz1 commented on August 22, 2024

> Ok, thank you! I use n_episode because my environment is a real-time environment (congestion control) and cannot pause step by step. Thus, I personally think it's useful to transform the random_collect_size parameter to the number of episodes when n_episode is used.

We have fixed this problem in #190, so you can now use the episode collector for random collection at the beginning of training.
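For readers wondering what episode-based random collection can look like, here is a minimal sketch of one possible approach: keep running whole episodes with random actions until a sample budget is met. It is not the change made in #190, whose details are not shown in this thread. The toy episode generator and the names `run_random_episode` and `random_collect_by_episode` are hypothetical.

```python
import random
from typing import List, Tuple

Transition = Tuple[int, float, bool]  # (step index, random action, done flag)


def run_random_episode(max_len: int, rng: random.Random) -> List[Transition]:
    """Hypothetical toy episode that runs to completion with uniform random actions."""
    length = rng.randint(1, max_len)
    return [(t, rng.uniform(-1.0, 1.0), t == length - 1) for t in range(length)]


def random_collect_by_episode(min_samples: int, max_len: int = 50,
                              seed: int = 0) -> List[List[Transition]]:
    """Collect complete episodes until at least min_samples transitions are gathered."""
    rng = random.Random(seed)
    episodes: List[List[Transition]] = []
    total = 0
    while total < min_samples:
        episode = run_random_episode(max_len, rng)
        episodes.append(episode)
        total += len(episode)
    return episodes


episodes = random_collect_by_episode(min_samples=1000)
print(len(episodes), sum(len(e) for e in episodes))
```

Because episodes are never cut short, the number of collected transitions can exceed the budget by up to one episode length, which suits environments that cannot be paused between steps.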
