Inverse Reinforcement Learning
Summary
Inverse Reinforcement Learning (IRL) is a subfield of machine learning that aims to infer the underlying reward function of an agent from observations of its behavior. The key challenge in IRL is that many different reward functions can explain the same observed behavior, making the problem ill-posed.

Recent approaches to IRL focus on resolving this ambiguity and on scaling to more complex environments. Cooperative IRL formulates the problem as a two-player game of partial information between a human and an AI agent, which incentivizes the human to actively teach. Adversarial IRL frameworks aim to learn robust, disentangled rewards that transfer across environments. Other techniques draw on positive-unlabeled learning, Bayesian optimization, and maximum causal entropy to explore the space of candidate reward functions efficiently, while multi-task and meta-learning extensions let IRL generalize across related tasks. Overall, modern IRL approaches enable more accurate inference of human preferences and values from demonstrations, with applications in value alignment, robot learning from humans, and building reward models for reinforcement learning.
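The ill-posedness mentioned above can be made concrete: potential-based reward shaping changes the reward function without changing the optimal policy, so demonstrations alone cannot distinguish the two rewards. The toy chain MDP below is an illustrative sketch invented for this summary (it does not come from any of the listed papers); the names `R`, `phi`, and `optimal_policy` are our own.

```python
# A 4-state deterministic chain: actions move left (-1) or right (+1).
GAMMA = 0.9
N_STATES, ACTIONS = 4, (-1, +1)

def step(s, a):
    """Deterministic transition, clipped to the chain's endpoints."""
    return min(max(s + a, 0), N_STATES - 1)

def greedy_policy(q_fn):
    """Value iteration with a caller-supplied one-step return, then greedy action per state."""
    V = [0.0] * N_STATES
    for _ in range(200):
        V = [max(q_fn(s, a, V) for a in ACTIONS) for s in range(N_STATES)]
    return [max(ACTIONS, key=lambda a: q_fn(s, a, V)) for s in range(N_STATES)]

R = [0.0, 0.0, 0.0, 1.0]      # original reward: goal at the right end
phi = [3.0, -1.0, 2.0, 0.5]   # arbitrary potential function for shaping

# Original reward: Q(s,a) = R(s') + gamma * V(s')
pi_original = greedy_policy(
    lambda s, a, V: R[step(s, a)] + GAMMA * V[step(s, a)])

# Shaped reward: R'(s,a) = R(s') + gamma*phi(s') - phi(s); a different
# reward function, yet it induces exactly the same optimal behavior.
pi_shaped = greedy_policy(
    lambda s, a, V: R[step(s, a)] + GAMMA * phi[step(s, a)] - phi[s]
    + GAMMA * V[step(s, a)])

print(pi_original == pi_shaped)  # True: both rewards explain the same demos
```

Since the shaped action values differ from the originals only by a per-state constant, the greedy policy is identical under both rewards; an IRL algorithm observing optimal demonstrations therefore cannot tell `R` and its shaped variant apart, which is the ambiguity the methods in the paper list try to address.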
Research Papers
- An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning
- Learning Robust Rewards with Adversarial Inverse Reinforcement Learning
- IQ-Learn: Inverse soft-Q Learning for Imitation
- Positive-Unlabeled Reward Learning
- A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress
- Generalized Hindsight for Reinforcement Learning
- Cooperative Inverse Reinforcement Learning
- Online Bayesian Goal Inference for Boundedly-Rational Planning Agents
- Preferences Implicit in the State of the World
- Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization
- Enabling Robots to Communicate their Objectives
- Where Do You Think You’re Going? Inferring Beliefs about Dynamics from Behavior
- Multi-task Maximum Entropy Inverse Reinforcement Learning
- Learning Human Objectives by Evaluating Hypothetical Behavior
- Multi-agent Inverse Reinforcement Learning for Certain General-sum Stochastic Games