CrystalGym: A New Benchmark for Materials Discovery Using Reinforcement Learning
A Generalist Hanabi Agent
Divide and Conquer: Provably Unveiling the Pareto Front with Multi-Objective Reinforcement Learning
Interactively learning the user's utility for best-arm identification in multi-objective multi-armed bandits
Exploring the Pareto front of multi-objective COVID-19 mitigation policies using reinforcement learning
Local Advantage Networks for Multi-Agent Reinforcement Learning in Dec-POMDPs
WAE-PCN: Wasserstein-autoencoded Pareto Conditioned Networks
Actor-critic multi-objective reinforcement learning for non-linear utility functions
Monte Carlo tree search algorithms for risk-aware and multi-objective reinforcement learning
Near On-Policy Experience Sampling in Multi-Objective Reinforcement Learning
Pareto Conditioned Networks
A Practical Guide to Multi-Objective Reinforcement Learning and Planning
Actor-Critic Multi-Objective Reinforcement Learning for Non-Linear Utility Functions
Distributional Monte Carlo Tree Search for Risk-Aware and Multi-Objective Reinforcement Learning
Interactive Multi-Objective Reinforcement Learning in Multi-Armed Bandits with Gaussian Process Utility Models
Pareto-DQN: Approximating the Pareto front in complex multi-objective decision problems
Reinforcement Learning for Demand Response of Domestic Household Appliances