Mathieu Reymond


Postdoctoral researcher at Mila.

About me

I am a postdoctoral researcher at Mila - the Quebec AI Institute working with Prof. Sarath Chandar. I completed my PhD at the Vrije Universiteit Brussel under the supervision of Prof. Ann Nowé and Diederik M. Roijers, where I focused on incorporating knowledge about the decision maker in multi-objective reinforcement learning. My current research focuses on reinforcement learning for scientific discovery through efficient exploration.


  1. [link] Reymond, M., Bargiacchi, E., Roijers, D. M., & Nowé, A. (to appear). Interactively learning the user’s utility for best-arm identification in multi-objective multi-armed bandits. International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2024.
  2. [link] Avalos, R., Reymond, M., Nowe, A., & Roijers, D. M. (2023). Local Advantage Networks for Multi-Agent Reinforcement Learning in Dec-POMDPs. Transactions on Machine Learning Research.
  3. [link] Reymond, M., Delgrange, F., Nowe, A., & Pérez, G. A. (2023). WAE-PCN: Wasserstein-autoencoded Pareto Conditioned Networks. In F. Cruz, C. F. Hayes, C. Wang, & C. Yates (Eds.), Proc. of the Adaptive and Learning Agents Workshop (ALA 2023) (15th ed., pp. 1–7).
  4. Reymond, M., Hayes, C. F., Steckelmacher, D., Roijers, D. M., & Nowe, A. (2023). Actor-critic multi-objective reinforcement learning for non-linear utility functions. Autonomous Agents and Multi-Agent Systems, 37(2).
  5. [link] Reymond, M., Hayes, C. F., Willem, L., Radulescu, R., Abrams, S., Roijers, D. M., Howley, E., Mannion, P., Hens, N., Nowe, A., & Libin, P. (2022, September 19). Exploring the Pareto front of multi-objective COVID-19 mitigation policies using reinforcement learning.
  6. [link] Avalos, R., Reymond, M., Nowe, A., & Roijers, D. M. (2022). Local Advantage Networks for Cooperative Multi-Agent Reinforcement Learning. International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2022, 1524–1526.
  7. [link] Wang, S., Reymond, M., Irissappane, A. A., & Roijers, D. M. (2022). Near On-Policy Experience Sampling in Multi-Objective Reinforcement Learning. The 21st International Conference on Autonomous Agents and Multiagent Systems, 1756–1758.
  8. [link] Reymond, M., Bargiacchi, E., & Nowe, A. (2022). Pareto Conditioned Networks. The 21st International Conference on Autonomous Agents and Multiagent Systems, 1110–1118.
  9. Hayes, C. F., Radulescu, R., Bargiacchi, E., Källström, J., Macfarlane, M., Reymond, M., Verstraeten, T., Zintgraf, L., Dazeley, R., Heintz, F., Howley, E., Irissappane, A. A., Mannion, P., Nowe, A., De Oliveira Ramos, G., Restelli, M., Vamplew, P., & Roijers, D. M. (2022). A Practical Guide to Multi-Objective Reinforcement Learning and Planning. Autonomous Agents and Multi-Agent Systems, 36(1).
  10. Avalos, R., Reymond, M., Nowe, A., & Roijers, D. M. (2022). Local Advantage Networks for Multi-Agent Reinforcement Learning in Dec-POMDPs. Proc. of the Adaptive and Learning Agents Workshop (ALA 2023), 1–17.
  11. [link] Reymond, M., Hayes, C. F., Roijers, D. M., Steckelmacher, D., & Nowe, A. (2021, July 14). Actor-Critic Multi-Objective Reinforcement Learning for Non-Linear Utility Functions.
  12. [link] Hayes, C. F., Reymond, M., Roijers, D. M., Howley, E., & Mannion, P. (2021). Distributional Monte Carlo Tree Search for Risk-Aware and Multi-Objective Reinforcement Learning. The 20th International Conference on Autonomous Agents and Multiagent Systems, 1518–1520.
  13. Roijers, D. M., Zintgraf, L. M., Libin, P., Reymond, M., Bargiacchi, E., & Nowe, A. (2021). Interactive Multi-Objective Reinforcement Learning in Multi-Armed Bandits with Gaussian Process Utility Models. In F. Hutter, K. Kersting, J. Lijffijt, & I. Valera (Eds.), ECML-PKDD 2020: Proceedings of the 2020 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Springer.
  14. [link] Reymond, M., & Nowe, A. (2019, May 13). Pareto-DQN: Approximating the Pareto front in complex multi-objective decision problems. Proceedings of the Adaptive and Learning Agents Workshop 2019 (ALA-19) at AAMAS.
  15. [link] Nevens, J., Radulescu, R., Reymond, M., Van Eecke, P., Efthymiadis, K., & Beuls, K. (2018). Hybrid AI for Visual Question Answering on CLEVR. In BNAIC 2018 Preproceedings (pp. 171–172).
  16. [link] Reymond, M., Patyn, C., Radulescu, R., Nowe, A., & Deconinck, G. (2018). Reinforcement Learning for Demand Response of Domestic Household Appliances. Proceedings of the Adaptive Learning Agents Workshop 2018 (ALA-18), 18–25.