Stop using static chunk sizes. A lightweight, production-ready RAG ingestion toolkit. Uses Docling for layout-aware parsing and applies file-type heuristics to choose a chunking strategy (PDF vs. code vs. Markdown). Extracted from a production RAG platform.
☆67 · Mar 15, 2026 · Updated last month
Alternatives and similar repositories for smart-ingest-kit
Users who are interested in smart-ingest-kit are comparing it to the libraries listed below.
- Enterprise-grade Retrieval-Augmented Generation system with microservices architecture. ☆13 · Mar 15, 2026 · Updated last month
- Self-Extensible Multi Agent Assistant 🐋 ☆54 · Feb 13, 2026 · Updated 2 months ago
- A Docker-powered RAG system that understands the difference between code and prose. Ingest your codebase and documentation, then query th… ☆237 · Mar 14, 2026 · Updated last month
- Zora: a long-running local AI agent with provider registry and secure tool access. ☆60 · Mar 25, 2026 · Updated 3 weeks ago
- A simple CPU-only OCR tool that converts PDF, image, Word, and Excel files to Markdown, with a Streamlit front end. ☆47 · Jan 26, 2026 · Updated 2 months ago
- Simile combines the power of AI embeddings with fuzzy string matching and keyword search to deliver highly relevant search results, all ru… ☆30 · Dec 28, 2025 · Updated 3 months ago
- A Python-native Terminal-Based Git Client - Navigate and manage your Git repositories with a beautiful TUI interface inspired by LazyGit. ☆33 · Feb 7, 2026 · Updated 2 months ago
- Augmented AI decision framework ☆27 · Jan 26, 2026 · Updated 2 months ago
- A Paperless-ngx consume script that leverages Docling to provide superior OCR and layout analysis for PDFs, Office documents, and images. ☆15 · Dec 7, 2025 · Updated 4 months ago
- A modern desktop application for exploring, managing, and analyzing vector databases ☆208 · Apr 1, 2026 · Updated 2 weeks ago
- ☆10 · Jun 29, 2021 · Updated 4 years ago
- Turning messy repos into weapons of mass structured context. ☆22 · Feb 20, 2026 · Updated last month
- Diff filtering, text mapping, and windowed transforms for LLM apps ☆22 · Sep 19, 2025 · Updated 7 months ago
- [H] HyperspaceDB is a high-performance, hyperbolic vector database written in Rust. It features 1-bit quantization, async replication, an… ☆78 · Apr 7, 2026 · Updated last week
- A fast, Linux-native desktop GUI for Ollama. Built with Tauri 2 (Rust) and React + TypeScript. ☆45 · Updated this week
- A boilerplate for publishing books as Jekyll blogs ☆19 · Nov 30, 2025 · Updated 4 months ago
- Open Source Public Repo of Microsoft Data & AI Platform ☆35 · Nov 10, 2025 · Updated 5 months ago
- ☆39 · Nov 17, 2025 · Updated 5 months ago
- Procedurally generated 3D drawing, part of the V series. ☆10 · Jun 14, 2015 · Updated 10 years ago
- ☆25 · Feb 10, 2026 · Updated 2 months ago
- Coordinate skills between Codex, Copilot, and Claude Code. Validates, analyzes, and syncs skills, subagents, commands, and configuration … ☆59 · Updated this week
- A library for structural-semantic chunking of documents. ☆12 · Oct 8, 2025 · Updated 6 months ago
- Session-Driven Development - Maintain perfect context across AI coding sessions with Claude Code ☆59 · Jan 16, 2026 · Updated 3 months ago
- A multi engine TTS & LLM edge computing playground with audio book features and more! ☆47 · Apr 13, 2026 · Updated last week
- Resources and notebooks to accompany the Duplicate Detection using GenAI paper ☆16 · Jul 1, 2024 · Updated last year
- This is a collection of CDK projects to show how to load data from streaming services into Amazon Redshift. ☆13 · Sep 10, 2024 · Updated last year
- Read and parse tables in P6 xer file. ☆19 · Dec 20, 2024 · Updated last year
- Secure shell command execution MCP server for Claude AI. Enables controlled shell access within specified directories. ☆18 · Aug 19, 2025 · Updated 8 months ago
- Tree-based, vectorless document RAG framework. Connect any LLM via URL/API key. ☆35 · Apr 7, 2026 · Updated last week
- Unofficial Logsnag client for Elixir ☆13 · May 11, 2025 · Updated 11 months ago
- FastFileLink CLI - Turn any file or folder into a secure, sharable HTTPS link. 🔄 Direct transfer (P2P). ⚡ Absolute privacy (E2EE). 🔒 Sin… ☆76 · Apr 13, 2026 · Updated last week
- Local SEO and Business Listings WordPress Plugin - Optimize your website with a Step By Step Actionable Local SEO Guide, a host of Local … ☆10 · Jul 28, 2015 · Updated 10 years ago
- Ecto type for datetimes stored and cast as Unix timestamps. 🕰️ ☆14 · Dec 22, 2024 · Updated last year
- An end-to-end ES/CQRS example with EventStoreDB and Elixir ☆12 · Jun 14, 2024 · Updated last year
- Ivar is an adapter based HTTP client that provides the ability to build composable HTTP requests. ☆17 · Oct 5, 2017 · Updated 8 years ago
- TreeThinkerAgent is a lightweight orchestration layer that turns any LLM into an autonomous multi-step reasoning agent. It supports multi… ☆21 · Feb 11, 2026 · Updated 2 months ago
- Middleware program which helps with talking to external programs from Elixir or Erlang. ☆12 · Apr 17, 2021 · Updated 5 years ago
- A library of functional extensions to the Phoenix LiveView framework. ☆14 · Jul 13, 2021 · Updated 4 years ago
- Create streams of Elixir structs, maps with atom keys, and keyword lists from CSV/TSV data streams ☆20 · Sep 8, 2019 · Updated 6 years ago