second-brain-agent

Basic Information

Second Brain AI Agent is a personal knowledge management system that automatically indexes a directory of markdown notes, together with the resources those notes link to, and lets you ask questions about the content. It extracts text from the markdown files and from linked resources such as PDFs, web pages, and YouTube videos, splits that text into chunks, creates vector embeddings, and stores them in a ChromaDB vector store. LangChain and an OpenAI large language model then answer questions about your content. The pipeline has three parts: transform_md.py and transform_txt.py handle ingestion and chunking; a retrieval-augmented generation workflow detects the intent of each question and routes it to a specialized chain for summaries, activity reports, or regular questions; and a Streamlit web UI, CLI utilities, and optional systemd services provide access and continuous processing. Environment variables control organization mapping and domain filtering.
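The chunking step described above can be illustrated with a minimal sketch. This is not the project's code: the function name chunk_text and its parameters are hypothetical, and the real transform_txt.py may rely on a LangChain text splitter with different rules. The sketch only shows the general idea of cutting text into fixed-size, overlapping pieces before embedding.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character chunks that overlap by `overlap` characters.

    Hypothetical helper for illustration; sizes are counted in characters,
    whereas a production splitter would usually respect sentence or
    paragraph boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval quality at the cost of storing some text twice.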

App Details

Features
Automated indexing of markdown notes and linked resources, including PDFs, web pages, and YouTube video transcripts.
Text extraction and history-aware splitting for journal files that have date-based sections.
Chunking and embedding creation via transform_txt.py, with storage in a ChromaDB vector store.
Retrieval-augmented generation driven by LangChain and an OpenAI large language model, with intent detection and separate chains for summaries, activity reports, URL or PDF lookups, and regular questions.
Domain metadata computation and filtering.
Configurable organization document via the SBA_ORG_DOC environment variable.
CLI scripts for similarity search and smart connection discovery.
Streamlit web UI.
Installation script for systemd services that run processing automatically in the background.
Testing and pre-commit recommendations for development.
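As a rough illustration of splitting a journal file into date-based sections, the sketch below groups the lines of a markdown journal under each dated heading. The heading format (a markdown heading whose whole text is an ISO date such as 2024-01-15) and the function name split_journal are assumptions made for this example; the project's actual journal conventions and splitting logic may differ.

```python
import re
from datetime import date

# Assumed convention: a heading line such as "## 2024-01-15" opens a section.
DATE_HEADING = re.compile(r"^#{1,6}\s+(\d{4})-(\d{2})-(\d{2})\s*$")


def split_journal(markdown: str) -> list[tuple[date, str]]:
    """Return (date, body) pairs, one per dated heading, in file order.

    Text that appears before the first dated heading is ignored.
    A heading with an impossible date (for example month 13) raises
    ValueError from the date constructor.
    """
    sections: list[tuple[date, str]] = []
    current_date: date | None = None
    current_lines: list[str] = []

    for line in markdown.splitlines():
        match = DATE_HEADING.match(line)
        if match:
            # Close the section that was open, then start a new one.
            if current_date is not None:
                sections.append((current_date, "\n".join(current_lines).strip()))
            year, month, day = (int(part) for part in match.groups())
            current_date = date(year, month, day)
            current_lines = []
        elif current_date is not None:
            current_lines.append(line)

    if current_date is not None:
        sections.append((current_date, "\n".join(current_lines).strip()))
    return sections


if __name__ == "__main__":
    sample = "# Journal\n\n## 2024-01-15\n\nMet Alice.\n\n## 2024-01-16\n\nWrote report.\n"
    for day, body in split_journal(sample):
        print(day.isoformat(), "->", body)
```

Keeping the date attached to each section is what makes time-scoped requests, such as an activity report for a given period, possible at query time.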
Use Cases
The project reduces the time spent searching and organizing notes by surfacing precise answers and summaries drawn from your own content. Students, professionals, researchers, and creatives can query their personal knowledge base to retrieve context-aware responses, generate activity reports from journal entries, and find new connections among notes using vector similarity. Domain metadata and filename conventions allow filtering by subject or area, and the SBA_ORG_DOC environment variable enables custom organization mappings. Automation via inotify-based processing and optional systemd services keeps the vector database up to date. A Streamlit UI and CLI tools provide accessible ways to explore, search, and visualize relationships in your second brain. The README documents installation steps, dependencies, and platform testing on Fedora and Ubuntu.
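Finding connections among notes by vector similarity can be sketched with plain cosine similarity. The function names, note filenames, and two-dimensional vectors below are made up for illustration; in the project, embeddings come from an embedding model and the search is delegated to the ChromaDB vector store rather than computed by hand.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_notes(
    query: str, embeddings: dict[str, list[float]], k: int = 2
) -> list[tuple[str, float]]:
    """Rank every other note by similarity to `query` and keep the top k."""
    target = embeddings[query]
    scored = [
        (name, cosine_similarity(target, vector))
        for name, vector in embeddings.items()
        if name != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy vectors, invented for this example only.
    toy = {
        "rag.md": [1.0, 0.0],
        "embeddings.md": [0.9, 0.1],
        "gardening.md": [0.0, 1.0],
    }
    for name, score in nearest_notes("rag.md", toy, k=2):
        print(f"{name}: {score:.3f}")
```

A brute-force scan like this is adequate for a handful of notes; a vector store exists precisely to avoid comparing the query against every stored embedding.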
