PER was reported to cause issues (decreasing the performance of a DQN) when ported to

Unit test Prioritised Experience Replay Memory (rainbow issue, closed)

kaixhin commented on July 21, 2024

Comments (5)

Ashutosh-Adhikari commented on July 21, 2024

I am not sure whether this is the correct logic behind PER.

What the current code does: in the training loop, when we call mem.append(), the new transition is given a default priority, transitions.max().

Shouldn't we do this instead? Calculate the priority before appending, and append with that priority. This would keep the complexity the same and attach the priority to the sample right away.

To the best of my knowledge, the paper does not specify this level of detail.

Kaixhin commented on July 21, 2024

Adding new transitions with the max priority is in line 6 of the algorithm in the PER paper; the initial value, 1, is given in line 2. Also, calculating the priority means having access to the future states (even more states when calculating multi-step returns) and doing the whole target calculation on a single sample, so it's not that cheap.
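The max-priority rule described above can be sketched as a toy buffer. This is a minimal illustration, not the repository's implementation: the class name `SimplePrioritisedMemory`, its methods, and the list-based O(n) sampling are all assumptions made for brevity (a real implementation would typically use a sum tree), and the absolute TD error is assumed as the priority.

```python
import random


class SimplePrioritisedMemory:
    """Toy proportional prioritised replay (hypothetical, list-based)."""

    def __init__(self, capacity, initial_priority=1.0):
        self.capacity = capacity
        self.data = []        # stored transitions
        self.priorities = []  # one priority per stored transition
        self.position = 0     # next slot to overwrite once the buffer is full
        self.max_priority = initial_priority  # maximum priority seen so far

    def append(self, transition):
        # New transitions get the maximum priority seen so far, so each one
        # is likely to be sampled at least once before being re-prioritised.
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(self.max_priority)
        else:
            self.data[self.position] = transition
            self.priorities[self.position] = self.max_priority
        self.position = (self.position + 1) % self.capacity

    def sample(self, batch_size, rng=random):
        # Sample indices with probability proportional to priority.
        idxs = rng.choices(range(len(self.data)),
                           weights=self.priorities, k=batch_size)
        return idxs, [self.data[i] for i in idxs]

    def update_priorities(self, idxs, td_errors, eps=1e-6):
        # Only sampled transitions have their priority recomputed.
        for i, err in zip(idxs, td_errors):
            p = abs(err) + eps
            self.priorities[i] = p
            self.max_priority = max(self.max_priority, p)
```

A unit test for such a memory could check exactly these properties: fresh transitions carry the running maximum, and priorities only change for indices passed to `update_priorities`.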

marintoro commented on July 21, 2024

I just read this in the paper Distributed Prioritized Experience Replay (Horgan et al.):
"In Prioritized DQN (Schaul et al., 2016) priorities for new transitions were initialized to the maximum priority seen so far, and only updated once they were sampled."

It's interesting to note that they changed this because it did not scale well (the paper is all about learning with a large number of actors).
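To make concrete what computing a priority at insertion time would involve, here is a hedged sketch of an n-step TD-error priority for a single transition. The function name `n_step_td_priority` and its parameters are hypothetical, and the absolute TD error is assumed as the priority; the point is only that the n future rewards and a bootstrapped value from the state n steps ahead are needed before the transition can be stored.

```python
def n_step_td_priority(rewards, q_start, bootstrap_value, gamma=0.99, eps=1e-6):
    """Priority |n-step return - Q(s_t, a_t)| for one transition.

    rewards:         the n rewards r_t, ..., r_{t+n-1}
    q_start:         current estimate Q(s_t, a_t)
    bootstrap_value: max_a Q(s_{t+n}, a), or 0.0 if s_{t+n} is terminal
    """
    n_step_return = 0.0
    for k, r in enumerate(rewards):
        n_step_return += (gamma ** k) * r  # discounted reward at step t+k
    # Bootstrap from the value of the state reached after n steps.
    n_step_return += (gamma ** len(rewards)) * bootstrap_value
    return abs(n_step_return - q_start) + eps
```

An actor that already evaluates Q-values while acting has most of these quantities available, which is one reason priorities at insertion are more practical in a many-actor setup than in a single training loop.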

Ashutosh-Adhikari commented on July 21, 2024

@Kaixhin Yep, I understand now that you mention n-step TD.

Kaixhin commented on July 21, 2024

Results on 3 games so far look promising, so closing unless a specific problem is identified.
