We have two papers accepted to ICRA 2023 this year!
The first paper is with a PhD student in my current group at CMU, Yue (Sophie) Guo. This is a really interesting paper in which we explore transfer learning in reinforcement learning by asking the following question: can we increase the effectiveness of transferred knowledge by having the “teacher” produce explanations for why that knowledge is important? We model transfer learning in the teacher-student paradigm, in which the teacher gives advice to the student in the form of state-action pairs, e.g. in this state you should perform this action. When the teacher also provides explanations for its advice, the student can internalize them and then self-apply the advice in the future, providing a mechanism for self-correction!
Explainable Action Advising for Multi-Agent Reinforcement Learning
Yue Guo, Joseph Campbell, Simon Stepputtis, Ruiyu Li, Dana Hughes, Fei Fang, Katia Sycara
Abstract: Action advising is a knowledge transfer technique for reinforcement learning based on the teacher-student paradigm. An expert teacher provides advice to a student during training in order to improve the student’s sample efficiency and policy performance. Such advice is commonly given in the form of state-action pairs; however, this form makes it difficult for the student to reason about the advice and apply it to novel states. We introduce Explainable Action Advising, in which the teacher provides action advice as well as associated explanations indicating why the action was chosen. This allows the student to self-reflect on what it has learned, enabling advice generalization and leading to improved sample efficiency and learning performance – even in environments where the teacher is sub-optimal. We empirically show that our framework is effective in both single-agent and multi-agent scenarios, yielding improved policy returns and convergence rates when compared to state-of-the-art methods.
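To make the self-advising mechanism a little more concrete, here is a minimal, hypothetical sketch, assuming the teacher’s explanation takes the form of a conjunction of state predicates (e.g. a decision-tree path). The `Advice` and `Student` classes and all names below are illustrative stand-ins, not the paper’s actual implementation:

```python
# Hypothetical sketch of explanation reuse in action advising. Assumes an
# explanation is a list of predicates over the state (e.g. a tree path).
from dataclasses import dataclass
from typing import Callable, List
import numpy as np

Predicate = Callable[[np.ndarray], bool]  # a test on a state vector

@dataclass
class Advice:
    action: int
    explanation: List[Predicate]  # why the teacher chose this action

class Student:
    def __init__(self, policy):
        self.policy = policy           # the student's own (learning) policy
        self.rules: List[Advice] = []  # internalized teacher explanations

    def receive_advice(self, advice: Advice) -> int:
        # Store the explanation so it can be re-applied later.
        self.rules.append(advice)
        return advice.action

    def act(self, state: np.ndarray) -> int:
        # Self-advising: if a previously seen explanation still holds in
        # this (possibly novel) state, reuse the advised action.
        for rule in self.rules:
            if all(pred(state) for pred in rule.explanation):
                return rule.action
        return self.policy(state)  # otherwise fall back to the own policy

# Toy usage: the teacher once explained "take action 1 when x > 0.5".
advice = Advice(action=1, explanation=[lambda s: s[0] > 0.5])
student = Student(policy=lambda s: 0)
student.receive_advice(advice)
print(student.act(np.array([0.9])))  # explanation holds -> reuses action 1
print(student.act(np.array([0.1])))  # explanation fails -> own policy, 0
```

The key point the sketch tries to capture is that a bare state-action pair only helps in the exact state it was given for, whereas an explanation carries the conditions under which the action is appropriate, so the student can generalize it to states the teacher never advised on.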
The second paper is a collaboration with my old PhD lab at ASU, in which Michael Drolet continued my work on hugging robots. After transforming our Baxter robot into a giant teddy bear(!), Michael developed a variant of ensemble Bayesian Interaction Primitives that sub-divides the ensemble into multiple classes, with each class representing a different hugging type. At run time, the method detects the most likely hugging type that the human partner is executing (based on linear discriminant analysis) in order to perform an appropriate response. This setup enables fluid transitions from one hug type to another, even though such transitions were never seen during training!
Learning and Blending Robot Hugging Behaviors in Time and Space
Michael Drolet, Joseph Campbell, Heni Ben Amor
Abstract: We introduce an imitation learning-based physical human-robot interaction algorithm capable of predicting appropriate robot responses in complex interactions involving a superposition of multiple interactions. Our proposed algorithm, Blending Bayesian Interaction Primitives (B-BIP), allows us to achieve responsive interactions in complex hugging scenarios, capable of reciprocating and adapting to a hug’s motion and timing. We show that this algorithm is a generalization of prior work, for which the original formulation reduces to the particular case of a single interaction, and evaluate our method through both an extensive user study and empirical experiments. Our algorithm yields significantly better quantitative prediction error and more favorable participant responses with respect to accuracy, responsiveness, and timing, when compared to existing state-of-the-art methods.
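For intuition, here is a rough sketch of the run-time class-detection step described above, assuming one ensemble of demonstrations per hug type and an LDA classifier over features of the partner’s partially observed motion. The data, featurization, and names are all illustrative; this is not B-BIP’s actual formulation:

```python
# Hypothetical sketch of run-time interaction-class detection: classify
# which hug type the partner is executing from a partial observation,
# then select the matching demonstration ensemble for the response.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)

def features(partial_traj: np.ndarray) -> np.ndarray:
    # Toy featurization of a partial trajectory (T x D): per-dimension mean.
    return partial_traj.mean(axis=0)

# "Training": demonstrations grouped by hug type (synthetic stand-ins here).
ensembles = {
    "side_hug":  [rng.normal(0.0, 0.1, size=(50, 3)) for _ in range(20)],
    "front_hug": [rng.normal(1.0, 0.1, size=(50, 3)) for _ in range(20)],
}
X = np.stack([features(d) for demos in ensembles.values() for d in demos])
y = [label for label, demos in ensembles.items() for _ in demos]

lda = LinearDiscriminantAnalysis().fit(X, y)

# Run time: classify the partner's ongoing motion from a short prefix,
# then condition the robot's response on the matching ensemble.
observed = rng.normal(1.0, 0.1, size=(10, 3))  # partial observation so far
hug_type = lda.predict(features(observed)[None])[0]
active_ensemble = ensembles[hug_type]          # respond using this class
print(f"Detected hug type: {hug_type}")
```

Because the classification runs on the observation stream rather than once per episode, the detected class can change mid-interaction, which is what allows the robot to transition smoothly between hug types it only ever saw in isolation during training.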