OpenContracts
Basic Information
OpenContracts is a free, open source (GPL-3) document analytics platform for enterprise use, focused on ingesting, annotating, and extracting data from PDF and text-based documents. It provides a web application and backend that manage document collections (Corpuses), parse document layout, generate vector embeddings, and display analyses and annotations over the original documents. The project emphasizes a pluggable pipeline and microservice architecture so that teams can add new parsers, embedders, and thumbnailers. It includes integrations such as LlamaIndex and a Django + pgvector-backed vector store for hybrid vector/metadata queries. The repository includes documentation, example parsers such as the Docling and NLM ingestors, a human annotation UI, and tooling for building custom data extractors and bespoke analytics. It is intended for organizations that need scalable, auditable document analysis and LLM-assisted querying of legal or other unstructured documents.
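To make the idea of a hybrid vector/metadata query concrete, here is a minimal, self-contained sketch in plain Python. It is not OpenContracts code and does not use Django, pgvector, or LlamaIndex. The names Chunk, cosine_similarity, hybrid_search, and the "corpus" metadata key are all hypothetical, chosen only for illustration. The sketch assumes the common pattern of first narrowing candidates by exact metadata match and then ranking the survivors by embedding similarity.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Chunk:
    """Hypothetical record: a piece of document text, its embedding, and metadata."""
    text: str
    embedding: list[float]
    metadata: dict[str, str] = field(default_factory=dict)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_search(
    chunks: list[Chunk],
    query_embedding: list[float],
    filters: dict[str, str],
    top_k: int = 3,
) -> list[Chunk]:
    """Filter by exact metadata match, then rank the survivors by similarity."""
    # Metadata step: keep only chunks whose metadata matches every filter.
    candidates = [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in filters.items())
    ]
    # Vector step: most similar embeddings first.
    ranked = sorted(
        candidates,
        key=lambda c: cosine_similarity(c.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data with made-up two-dimensional embeddings (assumption for illustration).
CHUNKS = [
    Chunk("Termination clause", [1.0, 0.0], {"corpus": "leases"}),
    Chunk("Indemnification clause", [0.0, 1.0], {"corpus": "leases"}),
    Chunk("Termination clause (NDA)", [1.0, 0.1], {"corpus": "ndas"}),
]

if __name__ == "__main__":
    for hit in hybrid_search(CHUNKS, [0.9, 0.1], {"corpus": "leases"}, top_k=1):
        print(hit.text)
```

In a database-backed store, both steps would typically be pushed into a single query so that filtering and ranking happen next to the data. The in-memory version above only illustrates the ordering of the two steps.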