LlamaGym
Basic Information
LlamaGym is a small Python library designed to simplify fine-tuning large language model agents using online reinforcement learning within Gym-style environments. It provides an Agent abstract class that centralizes handling of LLM conversation context, episode batching, reward assignment and integration with RL training loops so developers can more quickly iterate on agent prompting and hyperparameters. The README includes example usage showing how to implement three abstract methods to format observations, provide a system prompt, and extract actions, and demonstrates a typical RL loop that calls act, assign_reward and terminate_episode. The project is explicitly focused on making experimentation with online RL for LLM-based agents easier rather than providing a highly optimized production RL system.