Google DeepMind Introduces Scalable Inverse Reinforcement Learning for Language Models

    Sep 5, 2024, 09:23 AM

    Google DeepMind has introduced a new approach to language model training based on scalable Inverse Reinforcement Learning (IRL). The method is presented as an effective alternative to standard supervised Maximum Likelihood Estimation (MLE) in the fine-tuning pipeline, yielding more robust reward functions and improving both the performance and the diversity of model generations. The approach builds on imitation learning recast as a reinforcement learning problem: compared to supervised learning, IRL better exploits the sequential structure of text and online data, and it additionally extracts reward functions along the way. The findings were shared in a recent paper by Google DeepMind.
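    To make the contrast concrete, below is a minimal, hypothetical sketch in PyTorch. It compares a plain supervised MLE fine-tuning step with one common way of doing imitation through a learned reward (a GAIL-style adversarial setup): a discriminator is fit to tell expert text from the policy's own samples, and its score is used as a reward for a REINFORCE-style policy update. This is not DeepMind's implementation, whose exact formulation differs; all names (TinyLM, Discriminator, mle_step, irl_step) and the toy GRU architecture are made up for illustration.

# Hypothetical sketch only: contrasts supervised MLE fine-tuning with a
# GAIL-style adversarial imitation step, one way to learn a reward from
# demonstrations. Not the paper's method; names and shapes are invented.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, SEQ_LEN = 1000, 64, 16

class TinyLM(nn.Module):
    """Toy autoregressive policy: embedding -> GRU -> vocabulary logits."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens):                      # (B, T) -> (B, T, VOCAB)
        h, _ = self.rnn(self.emb(tokens))
        return self.head(h)

    @torch.no_grad()
    def sample(self, batch, bos=0):
        """Autoregressively sample sequences from the current policy."""
        seq = torch.full((batch, 1), bos, dtype=torch.long)
        for _ in range(SEQ_LEN - 1):
            logits = self.forward(seq)[:, -1]
            nxt = torch.multinomial(F.softmax(logits, dim=-1), 1)
            seq = torch.cat([seq, nxt], dim=1)
        return seq

class Discriminator(nn.Module):
    """Scores whole sequences; its logit acts as a learned sequence reward."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.out = nn.Linear(DIM, 1)

    def forward(self, tokens):                      # (B, T) -> (B,)
        h, _ = self.rnn(self.emb(tokens))
        return self.out(h[:, -1]).squeeze(-1)

def mle_step(policy, opt, expert_tokens):
    """Baseline: supervised next-token MLE on expert (demonstration) data."""
    logits = policy(expert_tokens[:, :-1])
    loss = F.cross_entropy(logits.reshape(-1, VOCAB),
                           expert_tokens[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

def irl_step(policy, disc, p_opt, d_opt, expert_tokens):
    """One adversarial-imitation step: fit the reward, then update the policy."""
    fake = policy.sample(expert_tokens.size(0))

    # 1) Reward/discriminator update: push expert sequences toward label 1
    #    and the policy's own samples toward label 0.
    d_loss = (F.binary_cross_entropy_with_logits(
                  disc(expert_tokens), torch.ones(expert_tokens.size(0)))
              + F.binary_cross_entropy_with_logits(
                  disc(fake), torch.zeros(fake.size(0))))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # 2) Policy update: REINFORCE with the learned sequence-level reward
    #    (higher score = more expert-like), using online samples.
    reward = disc(fake).detach()
    logp = F.log_softmax(policy(fake[:, :-1]), dim=-1)
    token_logp = logp.gather(-1, fake[:, 1:].unsqueeze(-1)).squeeze(-1).sum(-1)
    p_loss = -(reward * token_logp).mean()
    p_opt.zero_grad()
    p_loss.backward()
    p_opt.step()
    return d_loss.item(), p_loss.item()

    The design contrast the sketch is meant to show: the MLE step fits a fixed log-likelihood target from offline demonstrations, while the IRL-style step learns a reward from both demonstrations and the policy's own online samples, which is the kind of signal the announcement describes as yielding more robust reward functions.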
