This is a local, offline-capable web app for chatting with PDFs using retrieval-augmented generation (RAG). Built with Streamlit, LangChain, and Ollama (or Sambanova), it lets users upload PDFs, extract and chunk their content (optionally converting it to Markdown via OCR with Marker), embed the chunks into a local ChromaDB vector store, and query them through a conversational interface, all running on your own machine.
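The extract-chunk-embed pipeline described above typically splits extracted text into overlapping windows before embedding. A minimal sketch of that chunking step in plain Python (the app itself would use a LangChain text splitter; the `chunk_size` and `overlap` values here are illustrative assumptions, not the project's actual configuration):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows, a stand-in for
    a LangChain-style text splitter. Overlap preserves context that
    would otherwise be cut at chunk boundaries."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    step = chunk_size - overlap  # how far the window advances each iteration
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the final window already reached the end of the text
    return chunks
```

Each chunk would then be embedded and stored in ChromaDB, with the overlap ensuring that a sentence straddling two chunks is retrievable from either.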
Key Features
- Fully local execution with an optional remote LLM fallback
- Supports both standard PDF text extraction and OCR-based Markdown conversion (via Marker)
- Embeds document chunks into a ChromaDB vector store for semantic search
- Hybrid retrieval with re-ranking (semantic similarity + BM25)
- Choice of Ollama (local) or Sambanova (API) LLMs at runtime
- Configurable chat UI with memory reset, chunk controls, and conversation export
- Works without internet when using a local model, making it suitable for private or secure environments
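The hybrid retrieval feature combines a lexical score (BM25) with a semantic one and blends them to rank chunks. A self-contained toy sketch of that idea, assuming equal 0.5/0.5 weights and using token-set Jaccard overlap as a stand-in for embedding cosine similarity (the real app would use LangChain retrievers over ChromaDB embeddings; all names and weights below are illustrative):

```python
import math
from collections import Counter

def bm25_scores(query: list[str], docs: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each tokenized document against the query with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter()  # document frequency of each term
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
            s += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def hybrid_rank(query: list[str], docs: list[list[str]],
                w_bm25: float = 0.5, w_sem: float = 0.5) -> list[int]:
    """Rank documents by a weighted blend of normalized BM25 and a
    toy 'semantic' Jaccard score (embedding cosine in the real app)."""
    bm25 = bm25_scores(query, docs)
    mx = max(bm25) or 1.0
    bm25 = [s / mx for s in bm25]  # normalize lexical scores to [0, 1]
    q = set(query)
    sem = [len(q & set(d)) / len(q | set(d)) for d in docs]
    combined = [w_bm25 * s1 + w_sem * s2 for s1, s2 in zip(bm25, sem)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)
```

The blending step is why hybrid retrieval helps: BM25 rewards exact keyword matches that embeddings can miss, while the semantic score catches paraphrases that share no keywords with the query.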