2dogsandanerd / smart-ingest-kit
Stop using static chunk sizes. A lightweight, production-ready RAG ingestion toolkit. Uses Docling for layout-aware parsing and applies heuristics to choose chunking settings per content type (PDF vs. code vs. Markdown). Extracted from a production RAG platform.
☆63 · Nov 25, 2025 · Updated 2 months ago
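The idea of choosing chunk settings per content type, rather than one static size, can be illustrated with a minimal sketch. Everything here is hypothetical: `ChunkProfile`, `PROFILES`, `profile_for`, `chunk_text`, and the numeric values are invented for illustration and are not taken from smart-ingest-kit or Docling. The sketch splits by characters with a per-type window and overlap, which is a far simpler stand-in for layout-aware parsing.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ChunkProfile:
    chunk_size: int  # maximum characters per chunk
    overlap: int     # characters shared between neighbouring chunks


# Hypothetical values for illustration only; not taken from smart-ingest-kit.
PROFILES = {
    ".pdf": ChunkProfile(chunk_size=800, overlap=100),
    ".md": ChunkProfile(chunk_size=400, overlap=40),
    ".py": ChunkProfile(chunk_size=256, overlap=0),
}
DEFAULT_PROFILE = ChunkProfile(chunk_size=512, overlap=50)


def profile_for(path: str) -> ChunkProfile:
    """Pick a chunking profile from the file extension (case-insensitive)."""
    return PROFILES.get(Path(path).suffix.lower(), DEFAULT_PROFILE)


def chunk_text(text: str, profile: ChunkProfile) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by profile.overlap."""
    if not text:
        return []
    step = profile.chunk_size - profile.overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than chunk_size")
    # Stop early so the last window is never just a repeat of the previous overlap.
    last_start = max(len(text) - profile.overlap, 1)
    return [text[i:i + profile.chunk_size] for i in range(0, last_start, step)]
```

Calling `chunk_text(text, profile_for("report.pdf"))` would then use the PDF profile, while an unknown extension falls back to `DEFAULT_PROFILE`. A real pipeline would more likely split on structural boundaries (headings, functions, table rows) than on raw character offsets.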
Alternatives and similar repositories for smart-ingest-kit
Users who are interested in smart-ingest-kit compare it to the libraries listed below.
- Open Source Public Repo of Microsoft Data & AI Platform ☆34 · Nov 10, 2025 · Updated 3 months ago
- A simple CPU-only OCR for PDF/images/Word/Excel to Markdown, with Streamlit. ☆45 · Jan 26, 2026 · Updated 3 weeks ago
- ☆10 · Jun 29, 2021 · Updated 4 years ago
- ☆14 · Dec 7, 2025 · Updated 2 months ago
- A modern desktop application for exploring, managing, and analyzing vector databases ☆178 · Jan 30, 2026 · Updated 2 weeks ago
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte… ☆25 · Jan 6, 2026 · Updated last month
- This project aims to build a traveling recommendation application using Google Places API and OpenAI LLM. ☆11 · Mar 19, 2024 · Updated last year
- GridDB Foreign Data Wrapper for PostgreSQL ☆13 · Feb 10, 2025 · Updated last year
- ☆16 · Jan 23, 2026 · Updated 3 weeks ago
- Hands-on with GitHub Copilot: Building AI-Powered Study Plans with GitHub Models ☆17 · Oct 8, 2025 · Updated 4 months ago
- ☆15 · Jul 31, 2025 · Updated 6 months ago
- ☆12 · Oct 25, 2023 · Updated 2 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code ☆14 · Nov 29, 2021 · Updated 4 years ago
- Uses GPT as a real-time AI (artificial intelligence) tutor that helps students learn by taking screenshots of their work. ☆13 · May 14, 2024 · Updated last year
- An MCP Task Server ☆11 · Mar 7, 2025 · Updated 11 months ago
- This open-source project delivers a complete pipeline for converting multi-page documents (PDFs/images) into structured JSON using Vision… ☆15 · Aug 4, 2025 · Updated 6 months ago
- ☆15 · Feb 5, 2026 · Updated last week
- GitHub Copilot Adoption Plan - Workshops - Full Solution ☆16 · Jan 28, 2026 · Updated 2 weeks ago
- A team of AI agents that work together to assist the user to achieve his/her needs | Email management | Calendar management | Web explora… ☆16 · Aug 17, 2024 · Updated last year
- Simile combines the power of AI embeddings with fuzzy string matching and keyword search to deliver highly relevant search results—all ru… ☆27 · Dec 28, 2025 · Updated last month
- A CrewAI agent based app that helps you in finding flights and planning your itinerary at the destination with top recommended places to … ☆15 · Nov 30, 2024 · Updated last year
- Local CLI tool that lets you write natural language instructions and get the corresponding shell commands generated by a small language m… ☆21 · Nov 18, 2025 · Updated 2 months ago
- AutonomousSphere is an agentic collaboration server. Agents talk, act, and use tools like teammates. Federated servers form an internet o… ☆16 · May 13, 2025 · Updated 9 months ago
- Indexing framework designed for the automated creation of structured knowledge bases in Azure AI Search ☆14 · Jun 18, 2025 · Updated 7 months ago
- OpenBEXI is a WYSIWYG HTML builder. By resizing, dragging and dropping various HTML widgets from any Web Browsers, it is easy to build a … ☆12 · Dec 10, 2025 · Updated 2 months ago
- GitHub Repo for the Fabric AI Hackathon. Feel free to fork and have fun with it. ☆11 · Mar 7, 2024 · Updated last year
- Set up Clawd Bot automatically on Orgo. Free. ☆40 · Jan 24, 2026 · Updated 3 weeks ago
- GitHub Copilot Adoption Plan - Workshops - Labs ☆18 · Sep 18, 2025 · Updated 4 months ago
- A workshop for developing with the Azure SQL Database and Azure Services ☆13 · Jan 22, 2026 · Updated 3 weeks ago
- ☆13 · Nov 5, 2024 · Updated last year
- This repository hosts the instructions and workshop materials for Lab 333 - Evaluate Reasoning Models for Your Generative AI Solutions ☆19 · May 21, 2025 · Updated 8 months ago
- Food Recommendation ChatBot ☆10 · Dec 23, 2016 · Updated 9 years ago
- Example for agent orchestration ☆19 · Mar 31, 2025 · Updated 10 months ago
- Curate, evaluate, and ship LLM datasets from any document. ☆64 · Feb 7, 2026 · Updated last week
- GPT API Cost Estimation for Enterprises ☆13 · Oct 24, 2023 · Updated 2 years ago
- SQLGPT is an advanced SQL query generator powered by natural language processing. Seamlessly transforming plain English queries into comp… ☆10 · Oct 24, 2023 · Updated 2 years ago
- Salesforce SFDX project that holds the LWC and Aura components for Lightning Out to a Node.js server and Visualforce page. ☆10 · Jul 2, 2025 · Updated 7 months ago
- A Python script that downloads your whole Suno library using the token and URL you enter. ☆35 · Oct 30, 2025 · Updated 3 months ago
- A library for structural-semantic chunking of documents. ☆12 · Oct 8, 2025 · Updated 4 months ago