Stop using static chunk sizes. A lightweight, production-ready RAG ingestion toolkit. Uses Docling for layout-aware parsing and applies smart heuristics for optimal chunking (PDF vs Code vs MD). Extracted from a production RAG platform.
☆69Mar 15, 2026Updated 3 months ago
Alternatives and similar repositories for smart-ingest-kit
Users that are interested in smart-ingest-kit are comparing it to the libraries listed below.
- Enterprise-grade Retrieval-Augmented Generation system with microservices architecture.☆22Mar 15, 2026Updated 3 months ago
- A simple CPU only OCR for pdf/images/word/excel to markdown. With streamlit.☆51Jan 26, 2026Updated 5 months ago
- Simile combines the power of AI embeddings with fuzzy string matching and keyword search to deliver highly relevant search results—all ru…☆31Dec 28, 2025Updated 5 months ago
- Fine-tune LLMs and ML models with automatic dataset conversion, hyperparameter sweeps, and custom RL environments☆53May 17, 2026Updated last month
- ☆52Nov 18, 2025Updated 7 months ago
- A Python-native Terminal-Based Git Client - Navigate and manage your Git repositories with a beautiful TUI interface inspired by LazyGit.☆36Feb 7, 2026Updated 4 months ago
- Turning messy repos into weapons of mass structured context.☆23Feb 20, 2026Updated 4 months ago
- Middleware for AI Agents that verifies grounding and prevents hallucinations. Returns structured retry suggestions for self-correction.☆51Dec 11, 2025Updated 6 months ago
- ☆70Updated this week
- A framework for creating message-driven training systems with PyTorch☆21Oct 7, 2025Updated 8 months ago
- Diff filtering, text mapping, and windowed transforms for LLM apps☆24Jun 13, 2026Updated 2 weeks ago
- Random AI notes for working with local models or playing around with random machine learning bits.☆58Jun 7, 2026Updated 3 weeks ago
- A Python utility for seamless import and export of n8n workflows and credentials. Automates migration between environments, simplifies ba…☆13Apr 8, 2025Updated last year
- A collection of pipelines for Scrapy☆16Apr 27, 2026Updated 2 months ago
- Running Whisper, some LLM and Functionary on an Instinct Mi50 for HomeAssistant☆17Mar 9, 2025Updated last year
- A fast, Linux‑native desktop GUI for Ollama. Built with Tauri 2 (Rust) and React + TypeScript.☆53May 26, 2026Updated last month
- Local Development of AWS Glue with Docker and Visual Studio Code☆14Nov 29, 2021Updated 4 years ago
- ☆26May 27, 2026Updated last month
- ☆22Mar 15, 2024Updated 2 years ago
- PRIMAVERA Extensibility Essentials☆16Nov 16, 2022Updated 3 years ago
- ☆233May 20, 2026Updated last month
- OpenAI-compatible GigaChat API☆22Oct 27, 2025Updated 8 months ago
- ☆17Jun 6, 2025Updated last year
- This is a collection of CDK projects to show how to load data from streaming services into Amazon Redshift.☆13Sep 10, 2024Updated last year
- A multi engine TTS & LLM edge computing playground with audio book features and more!☆54Jun 19, 2026Updated last week
- Resources and notebooks to accompany the Duplicate Detection using GenAI paper☆16Jul 1, 2024Updated last year
- An AI avatar generator for video platforms.☆22Jan 19, 2025Updated last year
- ☆44Jan 19, 2026Updated 5 months ago
- ☆47May 19, 2025Updated last year
- Unofficial Logsnag client for Elixir☆13May 11, 2025Updated last year
- Local SEO and Business Listings Wordpress Plugin - Optimize your website with a Step By Step Actionable Local SEO Guide, a host of Local …☆10Jul 28, 2015Updated 10 years ago
- Scripts to automatically sync Claude Code generated TODO to TaskWarrior☆17Jun 22, 2025Updated last year
- Deep research agentic system using Test-Time Diffusion☆46Dec 11, 2025Updated 6 months ago
- An end-to-end ES/CQRS example with EventStoreDB and Elixir☆12Jun 14, 2024Updated 2 years ago
- ☆12Jan 15, 2024Updated 2 years ago
- Production-ready Python library for multi-provider LLM orchestration☆41Jun 19, 2026Updated last week
- Ivar is an adapter based HTTP client that provides the ability to build composable HTTP requests.☆17Oct 5, 2017Updated 8 years ago
- Save coding agents' conversations in Git Notes, automatically☆46Updated this week
- Middleware program which helps with talking to external programs from Elixir or Erlang.☆12Apr 17, 2021Updated 5 years ago