ai-pdf-chatbot-langchain

Basic Information

This monorepo is a customizable developer template for building an AI chatbot agent that ingests PDF documents, stores vector embeddings, and answers user queries with LangChain and LangGraph. It combines a Node.js/TypeScript backend, whose LangGraph graphs define the ingestion and retrieval workflows, with a Next.js React frontend for file uploads, real-time chat, and streaming responses. The project demonstrates Supabase as a vector store and OpenAI (or any other LLM supported by LangChain) for language modeling. It is organized as a Turborepo monorepo with example configuration and environment variable files for local development, and it accompanies the Learning LangChain book as a practical example. The repository is aimed at developers who want a ready-made, extensible pipeline for document QA agents, and at those learning to orchestrate agent workflows with LangGraph and LangChain.
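
For orientation, here is a minimal sketch of that ingest-and-store flow using LangChain's JS APIs. It is not the repo's actual code: the file path, chunk sizes, environment variable names, and the "documents" table / "match_documents" RPC (the defaults from the LangChain Supabase docs) are assumptions that the template's own configuration may override.

```typescript
import { createClient } from "@supabase/supabase-js";
import { OpenAIEmbeddings } from "@langchain/openai";
import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

// Parse the PDF into LangChain Document objects (one per page).
const docs = await new PDFLoader("example.pdf").load();

// Split pages into overlapping chunks sized for embedding.
const chunks = await new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
}).splitDocuments(docs);

// Embed the chunks and persist them in a Supabase table.
// Table and RPC names here are the LangChain docs' defaults; the
// template's configuration may use different names.
const client = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);
await SupabaseVectorStore.fromDocuments(chunks, new OpenAIEmbeddings(), {
  client,
  tableName: "documents",
  queryName: "match_documents",
});
```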

App Details

Features
- A Document Ingestion Graph that parses PDFs into Document objects and stores their embeddings in Supabase.
- A Retrieval Graph that handles questions, decides when to retrieve documents, and generates answers with source references (a minimal sketch follows this list).
- Streaming via server-sent events, so responses appear in the UI in real time.
- LangGraph for state-machine-style orchestration and visual debugging of agentic workflows.
- A Next.js frontend built with React and Tailwind for uploads and chat.
- Example backend files such as src/ingestion_graph.ts and src/retrieval_graph.ts, plus shared configuration files for changing vector stores, retriever settings, and prompts.
- Environment variable examples for frontend and backend, instructions for running langgraph:dev and yarn dev, and guidance for deploying both.
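
The sketch below illustrates the Retrieval Graph's state-machine pattern in LangGraph JS. It is a hedged approximation, not the repo's actual src/retrieval_graph.ts: the state shape, node names, model choice, k-value, and the trivial router (the real graph can use an LLM call to decide whether retrieval is needed) are all assumptions.

```typescript
import { StateGraph, Annotation, START, END } from "@langchain/langgraph";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { SupabaseVectorStore } from "@langchain/community/vectorstores/supabase";
import { createClient } from "@supabase/supabase-js";
import type { Document } from "@langchain/core/documents";

// Graph state: the question flows in; retrieved documents and the
// final answer flow out. The repo's actual state shape may differ.
const StateAnnotation = Annotation.Root({
  question: Annotation<string>,
  documents: Annotation<Document[]>,
  answer: Annotation<string>,
});

const client = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);
const vectorStore = new SupabaseVectorStore(new OpenAIEmbeddings(), {
  client,
  tableName: "documents",
  queryName: "match_documents",
});
const llm = new ChatOpenAI({ model: "gpt-4o-mini" });

// Placeholder router: the real graph can call an LLM to decide
// whether a question needs document retrieval at all.
function route(state: typeof StateAnnotation.State): "retrieve" | "generate" {
  return state.question ? "retrieve" : "generate";
}

async function retrieve(state: typeof StateAnnotation.State) {
  const documents = await vectorStore.asRetriever({ k: 4 }).invoke(state.question);
  return { documents };
}

async function generate(state: typeof StateAnnotation.State) {
  const context = (state.documents ?? [])
    .map((d) => d.pageContent)
    .join("\n\n");
  const res = await llm.invoke(
    `Answer the question using only this context:\n${context}\n\nQuestion: ${state.question}`
  );
  return { answer: res.content as string };
}

export const graph = new StateGraph(StateAnnotation)
  .addNode("retrieve", retrieve)
  .addNode("generate", generate)
  .addConditionalEdges(START, route, ["retrieve", "generate"])
  .addEdge("retrieve", "generate")
  .addEdge("generate", END)
  .compile();
```

Compiling the graph yields a runnable that the LangGraph server can expose over HTTP and that LangGraph Studio can visualize node by node.
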
Use Cases
The repo accelerates building a document question-answering assistant by providing end-to-end ingestion, storage, and retrieval patterns that developers can reuse. It shows how to extract text from PDFs, create and persist embeddings in Supabase, and route queries through LangGraph graphs to an LLM for concise, referenced answers. Real-time streaming improves interactivity, while LangGraph Studio and optional LangChain tracing help debug workflows. Configuration points and code locations are documented so teams can swap vector stores, change prompts, adjust retrieval k-values, and add retrievers or alternate LLM providers. The frontend keeps chat state per session and the backend persists ingested documents, making the template a practical starting point for production deployments or for experiments with document-backed agents.
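
As a sketch of the streaming side, a frontend could consume tokens from the LangGraph server (the one started by langgraph:dev) with the LangGraph SDK. The port (2024 is the CLI's default), the "retrieval_graph" assistant name, and the input shape are assumptions; they must match the repo's langgraph.json and graph state.

```typescript
import { Client } from "@langchain/langgraph-sdk";

// Point the SDK at the local server started by `langgraph:dev`.
const client = new Client({ apiUrl: "http://localhost:2024" });

// One thread per chat session keeps conversation state server-side.
const thread = await client.threads.create();

// Stream a run against the retrieval graph; the assistant name must
// match the graph registered in langgraph.json.
const stream = client.runs.stream(thread.thread_id, "retrieval_graph", {
  input: { question: "What does the ingested PDF say about pricing?" },
  streamMode: "messages",
});

// Chunks arrive as server-sent events; render tokens as they land.
for await (const chunk of stream) {
  console.log(chunk.event, chunk.data);
}
```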
