PDFStract - Extract, Chunking and Embedding Layer in Your RAG Pipeline - Available as CLI - WEBUI - API
☆151Mar 18, 2026Updated 3 months ago
Alternatives and similar repositories for pdfstract
Users who are interested in pdfstract are comparing it to the libraries listed below.
- Streaming Retrieval-Augmented Generation (RAG) agent in Go. It consumes real-time data from Kafka topics, processes it in configurable wi…☆27Jun 7, 2025Updated last year
- Find your files with natural language and ask questions.☆60May 27, 2026Updated last month
- Efficient MCP tool calling in code mode for Claude Code☆22Dec 12, 2025Updated 6 months ago
- Reverse-engineered Perplexity API client in Python. Facilitates WebSocket communication for real-time AI responses, maintaining session i…☆28May 9, 2024Updated 2 years ago
- ☆56Jan 12, 2026Updated 5 months ago
- One library to split them all: Sentence, Code, Docs. Chunk smarter, not harder — built for LLMs, RAG pipelines, and beyond.☆79Jun 22, 2026Updated last week
- ☆40May 23, 2026Updated last month
- An open source real-time AI inference engine for seamless scaling☆23Jul 2, 2025Updated last year
- ☆25Apr 4, 2025Updated last year
- Otto is an open-source browser agent that interacts with websites like a human.☆29Dec 20, 2025Updated 6 months ago
- 🔥 LitLytics - an affordable, simple analytics platform that leverages LLMs to automate data analysis☆104Nov 25, 2024Updated last year
- rclone mod☆15Apr 24, 2024Updated 2 years ago
- 🌪️ AI research assistant that generates Wikipedia-quality articles through multi-perspective analysis. Based on Stanford's STORM methodo…☆67Jun 6, 2025Updated last year
- This repo is an approach to TDD in machine learning model operation. It covers project structure, testing essentials using pytest with Gi…☆15Dec 2, 2020Updated 5 years ago
- PipesHub is a fully extensible and explainable workplace AI platform for enterprise search and workflow automation☆3,002Updated this week
- A powerful, yet simple to use, self-hosted redirect service☆41Jun 24, 2026Updated last week
- A blazingly fast microservice for matching ROM file hashes and caching game metadata. Originally designed for RetroRealm.☆26Jun 24, 2026Updated last week
- An AI-powered security analysis tool for web applications that combines Large Language Model (LLM) analysis with intelligent agent-based …☆46Jul 26, 2025Updated 11 months ago
- ocrbro is a dedicated lightweight n8n node that performs OCR on simple images and PDFs☆19Apr 3, 2026Updated 2 months ago
- Using deep research workflow to generate datasets for finetuning LLMs.☆40Oct 9, 2025Updated 8 months ago
- AI-powered text compression library for RAG systems and API calls. Reduce token usage by up to 50-60% while preserving semantic meaning w…☆86Aug 16, 2025Updated 10 months ago
- ☆16Jun 18, 2026Updated 2 weeks ago
- ☆30Oct 4, 2024Updated last year
- pdfLLM is a completely open source, proof of concept RAG app.☆187Sep 1, 2025Updated 10 months ago
- Chrome extension that provides comprehensive browser fingerprint protection by defending against various tracking techniques used across …☆31Oct 26, 2025Updated 8 months ago
- A natural language file search tool that uses LLMs to help you find files by describing what you're looking for.☆28Mar 8, 2025Updated last year
- Emotional status bar for Claude Code — dual-channel emotional transparency with research-backed model☆54Apr 16, 2026Updated 2 months ago
- Automated playlist generator with smarts☆83Jun 20, 2026Updated last week
- Document your Talon scripts using Sphinx.☆17Jun 22, 2026Updated last week
- An MCP (Model Context Protocol) server for interacting with a Paperless-NGX API server. This server provides tools for managing documents…☆119Jun 23, 2026Updated last week
- ☆23Jun 13, 2023Updated 3 years ago
- A markdown knowledgebase search tool combining semantic search, BM25 keyword matching, and knowledge graph traversal with reciprocal rank…☆40Jan 31, 2026Updated 5 months ago
- Request distributor for web scraping☆14Jun 8, 2025Updated last year
- Custom launcher for Claude Code, supporting dynamic prompts, layered configuration and easy custom hooks and MCPs.☆17Updated this week
- ISS Tracker for the Cardputer Adv☆49Jan 19, 2026Updated 5 months ago
- Text selection overlay for Talon☆10Aug 20, 2022Updated 3 years ago
- Self-hosted music discovery library manager with modern web UI, MCP tools and more...☆43Jun 23, 2026Updated last week
- Self-improving AI agents using Agentic Context Engineering - A starter implementation with Google ADK☆21Oct 23, 2025Updated 8 months ago
- The Go backend server of https://github.com/WarCluster/warcluster-client☆10Mar 17, 2016Updated 10 years ago