serverless-rag-demo

Basic Information

This repository provides a production-ready, easily deployable serverless Retrieval Augmented Generation (RAG) solution built on AWS services to augment foundation models with domain-specific content. It demonstrates document chat and document management workflows that convert documents and queries into embeddings, perform similarity search with the Amazon OpenSearch Serverless vector engine, and append the retrieved context to prompts sent to Amazon Bedrock models. The project also includes multi-agent collaboration through the Strands SDK, OCR for document ingestion, PII redaction, sentiment analysis, and a document-aware chunking strategy for cross-document comparison. The README documents a guided CloudShell-based deployment using a creator script, an AWS App Runner-hosted UI protected by Amazon Cognito, Lambda functions behind API Gateway, and instructions for integrating with existing Bedrock knowledge bases and embedding models. The codebase targets AWS practitioners who want an end-to-end serverless RAG demo configured for Anthropic Claude and Cohere embed models.
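
The core retrieval flow described above can be pictured with a short sketch: embed the user query with a Bedrock embedding model, run a k-NN search against the OpenSearch Serverless collection, and prepend the hits to a Claude prompt. This is a minimal illustration, not the repo's actual code; the collection endpoint, index name, and field names below are hypothetical placeholders.

    import json

    import boto3
    from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth

    REGION = "us-east-1"                              # assumption: match your deployment
    AOSS_HOST = "xxxx.us-east-1.aoss.amazonaws.com"   # hypothetical collection endpoint
    INDEX = "documents"                               # hypothetical index name

    bedrock = boto3.client("bedrock-runtime", region_name=REGION)
    auth = AWSV4SignerAuth(boto3.Session().get_credentials(), REGION, "aoss")
    aoss = OpenSearch(hosts=[{"host": AOSS_HOST, "port": 443}], http_auth=auth,
                      use_ssl=True, connection_class=RequestsHttpConnection)

    def answer(question: str) -> str:
        # 1. Embed the query with a Cohere embed model on Bedrock.
        emb = bedrock.invoke_model(
            modelId="cohere.embed-english-v3",
            body=json.dumps({"texts": [question], "input_type": "search_query"}))
        vector = json.loads(emb["body"].read())["embeddings"][0]

        # 2. k-NN similarity search against the OpenSearch Serverless vector index.
        #    "embedding" and "text" are assumed field names.
        hits = aoss.search(index=INDEX, body={
            "size": 3,
            "query": {"knn": {"embedding": {"vector": vector, "k": 3}}}})
        context = "\n".join(h["_source"]["text"] for h in hits["hits"]["hits"])

        # 3. Append the retrieved context to the prompt sent to Claude on Bedrock.
        resp = bedrock.invoke_model(
            modelId="anthropic.claude-3-sonnet-20240229-v1:0",
            body=json.dumps({
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": 512,
                "messages": [{"role": "user", "content":
                              f"Use this context:\n{context}\n\nQuestion: {question}"}]}))
        return json.loads(resp["body"].read())["content"][0]["text"]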

App Details

Features
- Document chat and document management with multilingual support and a document-aware chunking approach for answering questions and comparing content across documents.
- Vector store integration using the Amazon OpenSearch Serverless vector engine for fast similarity search with reduced vector infrastructure management.
- Multi-agent orchestration and collaboration implemented with the Strands SDK to support agentic workflows.
- OCR ingest pipeline to extract text from documents, plus PII redaction to sanitize sensitive information before indexing or display (see the sketch after this list).
- Sentiment analysis for content insights.
- Support for the Anthropic Claude model families and Cohere embedding models via Amazon Bedrock.
- Automated deployment instructions and scripts for CloudShell, an AWS App Runner-hosted UI with Cognito authentication, Lambda functions and API Gateway configuration, and explicit steps to integrate with an existing Bedrock knowledge base and adjust OpenSearch data access policies.
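
As a rough illustration of the PII redaction and sentiment features, the sketch below uses Amazon Comprehend; whether the repo implements these steps with Comprehend or another service is an assumption here, not a statement about its code.

    import boto3

    comprehend = boto3.client("comprehend", region_name="us-east-1")

    def redact_pii(text: str) -> str:
        """Mask detected PII spans before the text is indexed or displayed."""
        entities = comprehend.detect_pii_entities(Text=text, LanguageCode="en")["Entities"]
        # Replace spans from the end of the string so earlier offsets stay valid.
        for e in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
            text = text[:e["BeginOffset"]] + f"[{e['Type']}]" + text[e["EndOffset"]:]
        return text

    def sentiment(text: str) -> str:
        """Return POSITIVE / NEGATIVE / NEUTRAL / MIXED for a passage."""
        return comprehend.detect_sentiment(Text=text, LanguageCode="en")["Sentiment"]

    print(redact_pii("Contact Jane Doe at jane@example.com."))
    # -> Contact [NAME] at [EMAIL].
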
Use Cases
The project helps teams quickly deploy a serverless RAG stack that augments large language models with relevant external documents, improving domain-specific output quality. Using the Amazon OpenSearch Serverless vector engine avoids manual vector database management and provides scalable similarity search. Built-in OCR and PII redaction simplify ingestion and the safe handling of documents, while sentiment analysis adds content-level signals. Multi-agent orchestration via the Strands SDK enables complex, coordinated workflows rather than single-turn queries, as sketched below. The included CloudShell deployment script, App Runner UI with Cognito authentication, and Lambda-based serverless architecture reduce operational setup time and give AWS practitioners a reproducible pattern for prototyping or productionizing RAG-powered document chat and knowledge-base integrations with Bedrock models.
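
A minimal sketch of such a coordinated workflow, assuming the Strands Agents SDK's Agent and tool primitives and the common "agents as tools" pattern; the agent roles and prompts are illustrative and not taken from the repo.

    from strands import Agent, tool

    @tool
    def document_qa(query: str) -> str:
        """Answer a question strictly from retrieved document context."""
        qa_agent = Agent(system_prompt="Answer only from the retrieved document context.")
        return str(qa_agent(query))

    @tool
    def sentiment_report(text: str) -> str:
        """Summarize the overall sentiment of a passage."""
        sentiment_agent = Agent(system_prompt="Report the overall sentiment of the text.")
        return str(sentiment_agent(text))

    # The orchestrator delegates sub-tasks to the specialist agents above.
    orchestrator = Agent(tools=[document_qa, sentiment_report])
    orchestrator("Compare the tone of the two uploaded quarterly reports.")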
