Basic Information

This repository provides AI Scientist-v2, an end-to-end agentic system for automating scientific discovery. It autonomously generates research hypotheses, designs and runs experiments, analyzes results, and drafts scientific manuscripts using a progressive agentic tree search overseen by an experiment-manager agent. The workflow has two stages: an ideation step converts a user-provided topic markdown file into structured JSON research ideas, and a best-first tree search pipeline then explores candidate experiments in parallel according to the parameters in bfts_config.yaml. The code is designed to run on Linux with NVIDIA GPUs, CUDA, and PyTorch, and can call multiple LLM providers, including OpenAI, Gemini, and Claude (via AWS Bedrock). The README warns that the system executes LLM-written code and should be run in a controlled sandbox. Outputs include timestamped experiment logs, visualized search trees, and paper drafts produced by the pipeline.
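The two-stage workflow can be sketched from the command line. This is a minimal sketch assuming the script entry points named in this description; the flag names and file paths are illustrative, so check each script's --help and the README for the exact options.

```shell
# Stage 1: ideation — convert a topic markdown file into structured JSON ideas.
# API keys are read from environment variables such as OPENAI_API_KEY.
export OPENAI_API_KEY="..."
python perform_ideation_temp_free.py --workshop-file my_topic.md

# Stage 2: experiments — run the best-first tree search over the generated
# ideas, using the search settings in bfts_config.yaml.
python launch_scientist_bfts.py --load_ideas my_topic.json
```

Each stage writes its outputs to disk (the idea JSON, then timestamped experiment logs), so the stages can be run and inspected independently.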

App Details

Features
The project includes scripts and configuration to run the full pipeline, notably perform_ideation_temp_free.py for idea generation and launch_scientist_bfts.py for experiments. The best-first tree search is configurable via bfts_config.yaml, with parameters such as num_workers, steps, num_seeds, max_debug_depth, debug_prob, num_drafts, and other agent/search settings. The repository integrates optional literature search via Semantic Scholar (S2_API_KEY) to check novelty and gather citations. It supports multiple LLM backends through environment variables (OPENAI_API_KEY, GEMINI_API_KEY) and, optionally, AWS credentials for Bedrock to use Claude. Installation notes cover creating a conda environment, installing PyTorch with CUDA, and setting up PDF/LaTeX tooling. The tree search implementation builds on an external AIDE component, and the pipeline emits JSON idea files, experiment logs, and a unified_tree_viz.html visualization of the search tree.
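The search parameters listed above live in bfts_config.yaml. The fragment below is an illustrative sketch only — the nesting and values are assumptions, not the shipped defaults, so consult the bfts_config.yaml in the repository for the authoritative layout.

```yaml
# Illustrative sketch of bfts_config.yaml — field nesting and values are
# assumptions; check the file shipped with the repository.
agent:
  num_workers: 4        # parallel workers exploring the search tree
  steps: 21             # total tree-search steps per experiment
  num_seeds: 3          # random seeds per candidate experiment
  search:
    num_drafts: 3       # initial root-level experiment drafts
    max_debug_depth: 3  # consecutive debug attempts allowed on a failing node
    debug_prob: 0.5     # probability of debugging a buggy node vs. drafting anew
```

Tuning num_workers and num_drafts trades off exploration breadth against compute cost, while the debug settings control how persistently the agent repairs failing experiment code.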
Use Cases
AI Scientist-v2 helps researchers and developers automate and scale exploratory scientific workflows by converting high-level topic descriptions into concrete, testable research directions and then running those experiments autonomously. It reduces manual iteration by using LLMs to brainstorm, refine, and propose experimental code, and by running a configurable best-first tree search that explores multiple hypotheses in parallel. The system produces structured outputs for inspection, including idea JSONs, experiment logs, a visualization of the search tree, aggregated plots, and draft manuscripts, which speeds up prototyping and reproducibility. The README provides example commands, model choices, and estimated cost ranges per run to help users plan. Because the system executes generated code, it is best suited to controlled, sandboxed environments for risk-managed experimentation.