Abstract: In human-robot interaction, and more generally in multi-agent systems, decentralized agents often need to perform a task together. In such settings, the ability to anticipate the actions of other agents is crucial; without it, agents often perform very poorly. Humans are usually good at this, mostly because we can form good estimates of what other agents are trying to do. We want to give robots the same ability through reward learning and partner modeling. In this talk, I will discuss active learning approaches to this problem and how we can leverage preference data to learn objectives. I will show how preferences can help reward learning in settings where demonstration data may fail, and how partner modeling enables decentralized agents to cooperate efficiently.
Bio: Erdem Bıyık is a fifth-year Ph.D. candidate in the Electrical Engineering department at Stanford. He received his B.Sc. degree from Bilkent University, Turkey, in 2017 and his M.Sc. degree from Stanford University in 2019. He is interested in enabling robots to actively learn from various forms of human feedback and in designing altruistic robot policies to improve the efficiency of multi-agent systems in both cooperative and competitive settings. He also worked at Google as a research intern in 2021, where he adapted his active robot learning algorithms to recommender systems.