
Basic Information

Pearl is an open-source, production-ready reinforcement learning agent library released by the Applied Reinforcement Learning team at Meta to help researchers and practitioners build customizable RL agents. It is designed for developing agents that prioritize long-term cumulative feedback and that can operate under limited observability, sparse feedback, and high stochasticity. The repository provides modular building blocks, including policy learners, exploration modules, replay buffers, environment wrappers, and action representation modules, so users can assemble agents for research experiments or real-world systems. The README includes installation instructions (pip install -e .), a quickstart example using a Gym environment, several tutorial notebooks (recommender systems, contextual bandits, Frozen Lake, DQN/DDQN, actor-critic with safety), and notes on adoption in recommender systems, auction bidding, and creative selection.
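
As a quick orientation, here is a minimal sketch in the spirit of the README quickstart: install the package in editable mode (pip install -e .) and assemble a DQN agent on Gym's CartPole-v1. Class names and import paths follow the repository at the time of writing and may differ between versions; the hidden layer sizes, training rounds, and buffer capacity are illustrative.

    from pearl.pearl_agent import PearlAgent
    from pearl.action_representation_modules.one_hot_action_representation_module import (
        OneHotActionTensorRepresentationModule,
    )
    from pearl.policy_learners.sequential_decision_making.deep_q_learning import DeepQLearning
    from pearl.replay_buffers.sequential_decision_making.fifo_off_policy_replay_buffer import (
        FIFOOffPolicyReplayBuffer,
    )
    from pearl.utils.instantiations.environments.gym_environment import GymEnvironment

    # Wrap a Gym environment so it exposes Pearl's interface.
    env = GymEnvironment("CartPole-v1")

    # Assemble an agent from modular pieces: a DQN policy learner,
    # a one-hot action representation module, and a replay buffer.
    agent = PearlAgent(
        policy_learner=DeepQLearning(
            state_dim=env.observation_space.shape[0],
            action_space=env.action_space,
            hidden_dims=[64, 64],
            training_rounds=20,
            action_representation_module=OneHotActionTensorRepresentationModule(
                max_number_actions=env.action_space.n
            ),
        ),
        replay_buffer=FIFOOffPolicyReplayBuffer(10_000),
    )

    # Standard interaction loop: act, observe the result, learn.
    observation, action_space = env.reset()
    agent.reset(observation, action_space)
    done = False
    while not done:
        action = agent.act(exploit=False)
        result = env.step(action)
        agent.observe(result)
        agent.learn()
        done = result.done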


App Details

Features
Pearl emphasizes modularity and production features suitable for real applications. Core features documented in the README include dynamic action spaces, offline reinforcement learning support, intelligent neural exploration, safe decision-making modules, history summarization, and data-augmented replay buffers. The library exposes policy learners (e.g., DQN, actor-critic), action representation modules, and several replay buffer implementations, along with utilities and example benchmark configurations for mix-and-match agent construction. The project also added component serialization: components produce state dicts compatible with torch.save and torch.load, support get_extra_state and set_extra_state for non-parameter attributes, and implement a compare method required for testing. The repository also provides tutorial notebooks and examples for common environments and tasks.
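
Based on that description, a checkpointing round trip might look roughly like the sketch below. It assumes the agent from the quickstart above; make_policy_learner is a hypothetical helper that rebuilds a component with the same configuration, and the exact signature and return value of compare are assumptions.

    import torch

    # Components expose state dicts compatible with torch.save / torch.load,
    # with get_extra_state / set_extra_state covering non-parameter attributes.
    torch.save(agent.policy_learner.state_dict(), "policy_learner.pt")

    # Restore into a freshly constructed component with the same configuration.
    restored = make_policy_learner()  # hypothetical helper, not part of Pearl
    restored.load_state_dict(torch.load("policy_learner.pt"))

    # The compare method is described as required for testing; here it is assumed
    # to return a falsy value (e.g., an empty difference report) when components match.
    difference = agent.policy_learner.compare(restored)
    assert not difference, difference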
Use Cases
Pearl helps practitioners prototype, evaluate, and deploy reinforcement learning agents by providing reusable, interoperable components and production-oriented features. The modular design reduces engineering effort when experimenting with algorithms or when adapting agents to domain-specific needs like recommender systems, auction bidding, or creative selection. Tutorials and quickstart examples accelerate onboarding and reproducible experiments on classic environments such as CartPole, Frozen Lake, and contextual bandit datasets. Offline RL, dynamic action spaces, safe decision modules, and data augmentation make it practical for real-world workflows where training data is limited or actions change over time. Serialization and component comparison ease checkpointing, testing, and model interchange across experiments and deployments.
