Basic Information

This repository contains the code, demonstration notebooks, and logged experiment outputs for the NeurIPS 2023 paper "Reflexion: Language Agents with Verbal Reinforcement Learning." It is organized to reproduce and explore the experiments in the three domains described in the paper: reasoning (HotPotQA), decision-making (AlfWorld), and programming. The materials include notebooks that run agent variants, shell scripts that launch iterative AlfWorld trials, and recorded runs and logs from prior experiments. Setup instructions cover installing the required Python dependencies and configuring an OpenAI API key. The project exposes configurable agent types and reflexion strategies, and it stores outputs in structured log directories so that researchers and developers can inspect reasoning traces, self-reflections, and trial-level results without rerunning costly API experiments.
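
The setup described above amounts to installing the pinned dependencies and making an API key available before any notebook cell runs. A minimal sketch is shown below, assuming the standard requirements.txt at the repository root and the OPENAI_API_KEY environment variable mentioned later in this page; the snippet itself is illustrative and not part of the repository.

```python
import os
import subprocess
import sys

# Install the Python dependencies listed in requirements.txt
# (run once, from the repository root).
subprocess.run(
    [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
    check=True,
)

# The agents call the OpenAI API, so fail fast if the key is not configured.
if not os.environ.get("OPENAI_API_KEY"):
    raise RuntimeError(
        "Set OPENAI_API_KEY before running the notebooks, "
        "e.g. `export OPENAI_API_KEY=<your key>` in the shell."
    )
```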

App Details

Features
- Interactive notebooks for HotPotQA reasoning experiments, plus separate directories for AlfWorld decision-making and programming runs.
- Predefined agent types, such as ReAct and chain-of-thought variants, and an Enum of reflexion strategies: NONE, LAST_ATTEMPT, REFLEXION, and LAST_ATTEMPT_AND_REFLEXION (see the sketch after this list).
- Shell tooling (run_reflexion.sh) with parameters for num_trials, num_envs, run_name, use_memory, resume options, and logging locations.
- Example logs and root directories for reproducing reported runs, so users can inspect prior outputs.
- Figures illustrating the Reflexion approach and references to other implementations and related resources.
- A requirements.txt for dependency installation and instructions for setting the OPENAI_API_KEY environment variable.
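
The four strategy names above come directly from the repository; the Enum below is a minimal sketch of how such strategies could be declared and selected. Its values, comments, and module placement are illustrative rather than the repository's exact definition.

```python
from enum import Enum, auto

# Minimal sketch of a reflexion-strategy Enum; the member names match the
# ones listed above, while the values and comments are illustrative.
class ReflexionStrategy(Enum):
    NONE = auto()                        # no reflection signal between trials
    LAST_ATTEMPT = auto()                # carry over only the previous attempt
    REFLEXION = auto()                   # generate a verbal self-reflection
    LAST_ATTEMPT_AND_REFLEXION = auto()  # combine both signals

# Example: choosing a strategy for a notebook run.
strategy = ReflexionStrategy.REFLEXION
print(strategy.name)
```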
Use Cases
This repository helps researchers and developers reproduce and analyze experiments on language agents that learn through verbal self-reflection. Users can run the notebooks to sample HotPotQA questions, compare agent types and reflexion strategies, and explore recorded reasoning traces and self-reflections to understand failure modes and improvement patterns. The AlfWorld scripts support iterative decision-making trials, with options to enable persistent memory for storing reflections, resume interrupted runs, and tune trial and environment counts. Logged outputs allow offline analysis of agent behavior without rerunning costly API calls. The README also documents practical constraints, such as limited access to high-capacity models and API costs, and provides a citation and a contact for follow-up.
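
For the offline analysis mentioned above, a small script can walk a run's log directory and summarize what was recorded. The sketch below assumes outputs written as plain-text log files under a root/<run_name> directory; the directory layout, file extension, and run name are assumptions for illustration, not the repository's guaranteed structure.

```python
from pathlib import Path

# Hypothetical offline inspection of logged trial outputs. The layout
# root/<run_name>/ and the *.log extension are assumptions; adjust them
# to match the actual run directory you want to inspect.
run_dir = Path("root") / "example_run"  # illustrative run_name

for log_file in sorted(run_dir.rglob("*.log")):
    lines = log_file.read_text().splitlines()
    print(f"{log_file} ({len(lines)} lines)")
    # Show the tail of each file, where trial-level results tend to appear.
    for line in lines[-5:]:
        print("   ", line)
```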
