Stop using static chunk sizes. A lightweight, production-ready RAG ingestion toolkit. Uses Docling for layout-aware parsing and applies per-format heuristics to pick chunking parameters (PDF vs. code vs. Markdown). Extracted from a production RAG platform.
☆64 • Mar 15, 2026 • Updated 2 weeks ago
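The idea of choosing chunking parameters per file type rather than using one static size can be sketched roughly as follows. This is a minimal illustration only: `ChunkProfile`, `PROFILES`, `pick_profile`, `chunk_text`, and all the numbers are hypothetical and are not taken from smart-ingest-kit or Docling.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ChunkProfile:
    size: int     # target characters per chunk
    overlap: int  # characters shared between neighbouring chunks


# Hypothetical per-type profiles; the values are illustrative assumptions.
PROFILES = {
    ".pdf": ChunkProfile(size=800, overlap=120),
    ".md": ChunkProfile(size=600, overlap=80),
    ".py": ChunkProfile(size=400, overlap=0),
}
DEFAULT_PROFILE = ChunkProfile(size=512, overlap=64)


def pick_profile(filename: str) -> ChunkProfile:
    """Select a profile from the file extension, falling back to a default."""
    return PROFILES.get(Path(filename).suffix.lower(), DEFAULT_PROFILE)


def chunk_text(text: str, profile: ChunkProfile) -> list[str]:
    """Split text into fixed-size character windows with the given overlap."""
    step = profile.size - profile.overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + profile.size])
        # Stop once the current window already reaches the end of the text.
        if start + profile.size >= len(text):
            break
        start += step
    return chunks
```

A caller would run `chunk_text(text, pick_profile("report.pdf"))`. A real pipeline would more likely split on layout or syntax boundaries than on raw character offsets, so treat the character windows here as a stand-in.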
Alternatives and similar repositories for smart-ingest-kit
Users that are interested in smart-ingest-kit are comparing it to the libraries listed below.
- FastAPI + MLX offline-first voice agent with <1s latency. Minimal UI ☆51 • Oct 21, 2025 • Updated 5 months ago
- A simple CPU only OCR for pdf/images/word/excel to markdown. With streamlit. ☆46 • Jan 26, 2026 • Updated 2 months ago
- Simile combines the power of AI embeddings with fuzzy string matching and keyword search to deliver highly relevant search results—all ru… ☆29 • Dec 28, 2025 • Updated 3 months ago
- A Python-native Terminal-Based Git Client - Navigate and manage your Git repositories with a beautiful TUI interface inspired by LazyGit. ☆34 • Feb 7, 2026 • Updated last month
- Professional RAG development skills for Claude Code - audit, evaluate, optimize, and scaffold RAG pipelines ☆26 • Jan 18, 2026 • Updated 2 months ago
- [H] HyperspaceDB is a high-performance, hyperbolic vector database written in Rust. It features 1-bit quantization, async replication, an… ☆70 • Mar 21, 2026 • Updated last week
- BTC Map API ☆22 • Updated this week
- psychedelia syndrome: the pixels and code of a new kind of videogame ☆13 • Mar 1, 2026 • Updated 3 weeks ago
- Open Source Public Repo of Microsoft Data & AI Platform ☆35 • Nov 10, 2025 • Updated 4 months ago
- ☆39 • Nov 17, 2025 • Updated 4 months ago
- ☆27 • Aug 16, 2025 • Updated 7 months ago
- PRIMAVERA Extensibility Essentials ☆15 • Nov 16, 2022 • Updated 3 years ago
- ☆25 • Feb 10, 2026 • Updated last month
- Pluggable sample-level metadata versioning for incremental multimodal pipelines. ☆82 • Updated this week
- ☆40 • Nov 13, 2025 • Updated 4 months ago
- Coordinate skills between Codex, Copilot, and Claude Code. Validates, analyzes, and syncs skills, subagents, commands, and configuration … ☆56 • Updated this week
- ☆55 • Updated this week
- A multi engine TTS & LLM edge computing playground with audio book features and more! ☆44 • Mar 21, 2026 • Updated last week
- Tree-based, vectorless document RAG framework. Connect any LLM via URL/API key. ☆27 • Mar 22, 2026 • Updated last week
- ☆47 • May 19, 2025 • Updated 10 months ago
- Lightweight self-hosted PaaS, built with Go ☆60 • Mar 6, 2026 • Updated 3 weeks ago
- This is a collection of CDK projects to show how to load data from streaming services into Amazon Redshift. ☆13 • Sep 10, 2024 • Updated last year
- Simple boids/flocking implementation in GDScript / Godot ☆17 • Mar 31, 2021 • Updated 4 years ago
- Read and parse tables in P6 xer file. ☆17 • Dec 20, 2024 • Updated last year
- InfraMind: Fine-tuning toolkit for training SLMs on Infrastructure-as-Code using GRPO/DAPO. Achieves 97.3% accuracy on IaC generation. ☆68 • Dec 15, 2025 • Updated 3 months ago
- Linear Algebra library in GDScript for Godot Engine ☆14 • Aug 10, 2020 • Updated 5 years ago
- Local CLI tool that lets you write natural language instructions and get the corresponding shell commands generated by a small language m… ☆21 • Nov 18, 2025 • Updated 4 months ago
- Unofficial Logsnag client for Elixir ☆13 • May 11, 2025 • Updated 10 months ago
- A script for Adobe Photoshop that randomly perturbs the font attributes of a text layer for each character in the layer ☆23 • Apr 30, 2020 • Updated 5 years ago
- Local SEO and Business Listings WordPress Plugin - Optimize your website with a Step By Step Actionable Local SEO Guide, a host of Local … ☆10 • Jul 28, 2015 • Updated 10 years ago
- Ecto type for datetimes stored and cast as Unix timestamps. 🕰️ ☆14 • Dec 22, 2024 • Updated last year
- Scripts to automatically sync Claude Code generated TODO to TaskWarrior ☆17 • Jun 22, 2025 • Updated 9 months ago
- An end-to-end ES/CQRS example with EventStoreDB and Elixir ☆12 • Jun 14, 2024 • Updated last year
- ☆25 • Jul 21, 2025 • Updated 8 months ago
- Ivar is an adapter based HTTP client that provides the ability to build composable HTTP requests. ☆17 • Oct 5, 2017 • Updated 8 years ago
- TreeThinkerAgent is a lightweight orchestration layer that turns any LLM into an autonomous multi-step reasoning agent. It supports multi… ☆21 • Feb 11, 2026 • Updated last month
- Save coding agents' conversations in Git Notes, automatically ☆41 • Updated this week
- Middleware program which helps with talking to external programs from Elixir or Erlang. ☆12 • Apr 17, 2021 • Updated 4 years ago
- A library of functional extension to the Phoenix LiveView framework. ☆14 • Jul 13, 2021 • Updated 4 years ago