data-prep-kit / data-prep-kit
Open source project for data preparation for GenAI applications
☆808 Updated this week
Alternatives and similar repositories for data-prep-kit
Users who are interested in data-prep-kit are comparing it to the libraries listed below.
- Granite Snack Cookbook -- easily consumable recipes (Python notebooks) that showcase the capabilities of the Granite models ☆267 Updated last week
- ☆151 Updated last month
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t… ☆486 Updated 7 months ago
- An open-source tool for general prompt optimization. ☆637 Updated last week
- Discover, run, and compose AI agents from any framework. ☆779 Updated last week
- This repository shares end-to-end notebooks on how to use various Weaviate features and integrations! ☆893 Updated last week
- Create large-scale synthetic training data for model distillation and evaluation ☆581 Updated this week
- Tool for generating high-quality synthetic datasets ☆1,238 Updated last week
- This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation.… ☆422 Updated 6 months ago
- Build datasets using natural language ☆529 Updated 2 weeks ago
- 👩🏻‍🍳 A collection of example notebooks using Haystack ☆504 Updated this week
- ☆264 Updated 3 months ago
- This NVIDIA RAG blueprint serves as a reference solution for a foundational Retrieval Augmented Generation (RAG) pipeline. ☆280 Updated last week
- Simple package to extract text with coordinates from programmatic PDFs ☆198 Updated 2 weeks ago
- OpenTelemetry Instrumentation for AI Observability ☆632 Updated this week
- VectorHub is a free, open-source learning website for people (software developers to senior ML architects) interested in adding vector re… ☆495 Updated last week
- Open protocol for communication between AI agents, applications, and humans. ☆867 Updated last month
- Build Research and RAG agents with Granite on your laptop ☆144 Updated this week
- A collection of notebooks/recipes showcasing use cases of open-source models with Together AI. ☆1,058 Updated last month
- Run the entire bee application stack using docker-compose ☆155 Updated 6 months ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆826 Updated 8 months ago
- A Python library to define and validate data types in Docling. ☆185 Updated last week
- Workflows are an event-driven, async-first, step-based way to control the execution flow of AI applications like agents. ☆221 Updated this week
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ☆456 Updated 2 months ago
- An Awesome list of curated DSPy resources. ☆448 Updated last month
- A Lightweight Library for AI Observability ☆251 Updated 7 months ago
- A library for prompt engineering and optimization (SAMMO = Structure-aware Multi-Objective Metaprompt Optimization) ☆730 Updated 3 months ago
- Semantic Chunker is a lightweight Python package for semantically-aware chunking and clustering of text. ☆272 Updated 5 months ago
- Build fast and accurate GenAI apps with GraphRAG SDK at scale. ☆459 Updated last week
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ☆750 Updated 4 months ago