- Playing Atari with Deep Reinforcement Learning
- Human-level control through deep reinforcement learning
- Rainbow: Combining Improvements in Deep Reinforcement Learning
- Deep Reinforcement Learning with Double Q-learning (double)
- Dueling Network Architectures for Deep Reinforcement Learning (dueling)
- Prioritized Experience Replay (PER)
- Prioritized Sequence Experience Replay (PSER)
- Hindsight Experience Replay
- Massively Parallel Methods for Deep Reinforcement Learning (Gorila)
- Asynchronous Methods for Deep Reinforcement Learning (A3C)
- Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU (GA3C)
- Efficient Parallel Methods for Deep Reinforcement Learning (PAAC)
- Accelerated Methods for Deep Reinforcement Learning
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures (IMPALA)
- Statistical Aspects of Wasserstein Distances
- The Cramer Distance as a Solution to Biased Wasserstein Gradients
- A Distributional Perspective on Reinforcement Learning (C51)
- An Analysis of Categorical Distributional Reinforcement Learning
- Distributional Reinforcement Learning with Quantile Regression (QR-DQN)
- Implicit Quantile Networks for Distributional Reinforcement Learning (IQN)
- Fully Parameterized Quantile Function for Distributional Reinforcement Learning (FQF)
- Statistics and Samples in Distributional Reinforcement Learning (ER-DQN)
- A distributional code for value in dopamine-based reinforcement learning
- QUOTA: The Quantile Option Architecture for Reinforcement Learning
- Deep Exploration via Bootstrapped DQN
- Parameter Space Noise for Exploration
- Noisy Networks for Exploration
- Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
- Deterministic Policy Gradient Algorithms
- Continuous Control with Deep Reinforcement Learning
- Distributed Distributional Deterministic Policy Gradients
- Natural Gradient Works Efficiently in Learning
- A Natural Policy Gradient
- Approximately Optimal Approximate Reinforcement Learning
- New insights and perspectives on the natural gradient method
- Trust Region Policy Optimization
- High-Dimensional Continuous Control Using Generalized Advantage Estimation
- Generative Adversarial Imitation Learning
- Sample Efficient Actor-Critic with Experience Replay
- Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
- TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow
- Proximal Policy Optimization Algorithms
- Emergence of Locomotion Behaviours in Rich Environments
- Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO
- CutOut - Improved Regularization of Convolutional Neural Networks with Cutout
- BN - Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- LN - Layer Normalization
- IN - Instance Normalization: The Missing Ingredient for Fast Stylization
- GN - Group Normalization
- BIN - Batch-Instance Normalization for Adaptively Style-Invariant Neural Networks
- BRN - Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models
- CBN - Cross-Iteration Batch Normalization
- SE - Squeeze-and-Excitation Networks
- CBAM - CBAM: Convolutional Block Attention Module
- AlexNet - ImageNet Classification with Deep Convolutional Neural Networks
- ZFNet - Visualizing and Understanding Convolutional Networks
- dropout - Improving neural networks by preventing co-adaptation of feature detectors
- maxout - Maxout Networks
- Network In Network
- VGG - Very Deep Convolutional Networks for Large-Scale Image Recognition
- CSP - CSPNet: A New Backbone that can Enhance Learning Capability of CNN
- SPP - Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
- FPN - Feature Pyramid Networks for Object Detection
- PAN - Path Aggregation Network for Instance Segmentation
- Pose Machines: Articulated Pose Estimation via Inference Machines
- Convolutional Pose Machines
- Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
- OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
- PARE: Part Attention Regressor for 3D Human Body Estimation