Stop using static chunk sizes. A lightweight, production-ready RAG ingestion toolkit. Uses Docling for layout-aware parsing and applies heuristics to choose chunking per content type (PDF vs. code vs. Markdown). Extracted from a production RAG platform.
☆69Mar 15, 2026Updated 2 months ago
Alternatives and similar repositories for smart-ingest-kit
Users that are interested in smart-ingest-kit are comparing it to the libraries listed below.
- Enterprise-grade Retrieval-Augmented Generation system with microservices architecture.☆21Mar 15, 2026Updated 2 months ago
- Self-Extensible Multi Agent Assistant 🐋☆58Feb 13, 2026Updated 3 months ago
- FastAPI + MLX offline-first voice agent with <1s latency. Minimal UI☆56Oct 21, 2025Updated 6 months ago
- A Docker-powered RAG system that understands the difference between code and prose. Ingest your codebase and documentation, then query th…☆251Mar 14, 2026Updated 2 months ago
- A simple CPU-only OCR tool that converts PDF, image, Word, and Excel files to Markdown. Built with Streamlit.☆49Jan 26, 2026Updated 3 months ago
- A Python-native Terminal-Based Git Client - Navigate and manage your Git repositories with a beautiful TUI interface inspired by LazyGit.☆34Feb 7, 2026Updated 3 months ago
- ☆61May 12, 2026Updated last week
- A modern desktop application for exploring, managing, and analyzing vector databases☆223Apr 28, 2026Updated 3 weeks ago
- Middleware for AI Agents that verifies grounding and prevents hallucinations. Returns structured retry suggestions for self-correction.☆51Dec 11, 2025Updated 5 months ago
- A framework for creating message-driven training systems with PyTorch☆21Oct 7, 2025Updated 7 months ago
- Self hosted application to access IMAP accounts over REST☆12Mar 22, 2020Updated 6 years ago
- A Python utility for seamless import and export of n8n workflows and credentials. Automates migration between environments, simplifies ba…☆13Apr 8, 2025Updated last year
- A boilerplate for publishing books as Jekyll blogs☆20Nov 30, 2025Updated 5 months ago
- Open Source Public Repo of Microsoft Data & AI Platform☆35Nov 10, 2025Updated 6 months ago
- ☆17Oct 1, 2023Updated 2 years ago
- ☆39Nov 17, 2025Updated 6 months ago
- A fast, Linux‑native desktop GUI for Ollama. Built with Tauri 2 (Rust) and React + TypeScript.☆49Apr 16, 2026Updated last month
- ☆23Apr 30, 2026Updated 2 weeks ago
- Procedurally generated 3D drawing, part of the V series.☆10Jun 14, 2015Updated 10 years ago
- Browser extension for scrolling a page with j and k, and a little bit more☆16Apr 6, 2026Updated last month
- 🏭 The open-source Palantir Foundry alternative. Connect any data source, build ontologies, create pipelines, visualize with dashboards, …☆192Updated this week
- PRIMAVERA Extensibility Essentials☆16Nov 16, 2022Updated 3 years ago
- [H] HyperspaceDB is a high-performance vector database. It features 1-bit quantization, async replication, and native support for hierar…☆113Apr 20, 2026Updated 3 weeks ago
- An opinionated theme-aware shell prompt☆19Sep 24, 2025Updated 7 months ago
- ☆38Nov 13, 2025Updated 6 months ago
- A library for structural-semantic chunking of documents.☆12Oct 8, 2025Updated 7 months ago
- OpenAI-compatible GigaChat API☆22Oct 27, 2025Updated 6 months ago
- Pluggable sample-level metadata versioning for incremental multimodal pipelines.☆96Updated this week
- Coordinate skills between Codex, Copilot, and Claude Code. Validates, analyzes, and syncs skills, subagents, commands, and configuration …☆63May 10, 2026Updated last week
- The Future of AI-Powered Development: Orchestr8 Transforms Claude Code Into a Complete Software Engineering Team☆65Nov 13, 2025Updated 6 months ago
- Resources and notebooks to accompany the Duplicate Detection using GenAI paper☆16Jul 1, 2024Updated last year
- ☆47May 19, 2025Updated last year
- Lightweight self-hosted PaaS, built with Go☆61Apr 30, 2026Updated 2 weeks ago
- Cross-machine AI agent communication, plus a mobile app to control any terminal on your machine.☆114May 11, 2026Updated last week
- Read and parse tables in P6 xer file.☆19Dec 20, 2024Updated last year
- Unofficial Logsnag client for Elixir☆13May 11, 2025Updated last year
- Local CLI tool that lets you write natural language instructions and get the corresponding shell commands generated by a small language m…☆21Nov 18, 2025Updated 6 months ago
- Local SEO and Business Listings Wordpress Plugin - Optimize your website with a Step By Step Actionable Local SEO Guide, a host of Local …☆10Jul 28, 2015Updated 10 years ago
- Scripts to automatically sync Claude Code generated TODO to TaskWarrior☆17Jun 22, 2025Updated 10 months ago