brandonstarxel / chunking_evaluation
This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation. It allows users to compare different chunking methods and includes implementations of several novel chunking strategies.
☆331 · Updated 3 months ago
Alternatives and similar repositories for chunking_evaluation
Users who are interested in chunking_evaluation are comparing it to the libraries listed below.
- Code for explaining and evaluating late chunking (chunked pooling) ☆403 · Updated 6 months ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆797 · Updated 4 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆173 · Updated 9 months ago
- ☆137 · Updated last week
- 👩🏻🍳 A collection of example notebooks using Haystack ☆482 · Updated last week
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻🍳 ☆313 · Updated 3 weeks ago
- ☆143 · Updated 11 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆329 · Updated this week
- Late Interaction Models Training & Retrieval ☆444 · Updated last week
- ☆225 · Updated last week
- Readymade evaluators for agent trajectories ☆239 · Updated last month
- Simple UI for debugging correlations of text embeddings ☆276 · Updated 3 weeks ago
- This repository shares end-to-end notebooks on how to use various Weaviate features and integrations! ☆777 · Updated this week
- FastAPI wrapper around DSPy ☆247 · Updated last year
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ☆741 · Updated last month
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,449 · Updated 3 weeks ago
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ☆327 · Updated 2 weeks ago
- ☆180 · Updated 6 months ago
- A Lightweight Library for AI Observability ☆245 · Updated 4 months ago
- ☆195 · Updated last year
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆426 · Updated last year
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆121 · Updated last month
- This repo is the central repo for all the RAG Evaluation reference material and partner workshop ☆65 · Updated last month
- Semantic Chunker is a lightweight Python package for semantically-aware chunking and clustering of text. ☆255 · Updated 2 months ago
- Generate large synthetic data using an LLM ☆426 · Updated this week
- ☆122 · Updated 3 months ago
- ☆98 · Updated 6 months ago
- ☆143 · Updated last month
- This project will bootstrap and scaffold projects for specific semantic search and RAG applications, along with regular boilerplate c… ☆90 · Updated 6 months ago
- Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cro… ☆815 · Updated 6 months ago