brandonstarxel / chunking_evaluation
This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation. It allows users to compare different chunking methods and includes implementations of several novel chunking strategies.
☆91 Updated 2 months ago
Related projects:
- A simple Python sandbox for helpful LLM data agents ☆143 Updated 3 months ago
- FastAPI wrapper around DSPy ☆201 Updated 6 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆117 Updated 3 weeks ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆330 Updated this week
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ☆72 Updated last week
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆101 Updated last week
- Additional packages (components, document stores and the likes) to extend the capabilities of Haystack version 2.0 and onwards ☆105 Updated this week
- Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph ☆143 Updated 5 months ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆223 Updated 4 months ago
- Claude API Test Project ☆87 Updated 4 months ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆99 Updated 8 months ago
- Using various instructor clients evaluating the quality and capabilities of extractions and reasoning. ☆45 Updated last week
- Tutorial for building LLM router ☆145 Updated 2 months ago
- A Ruby on Rails style framework for the DSPy (Demonstrate, Search, Predict) project for Language Models like GPT, BERT, and LLama. ☆101 Updated this week
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆362 Updated 7 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆93 Updated 5 months ago
- WIP - Allows you to create DSPy pipelines using ComfyUI ☆170 Updated last month
- Domain Adapted Language Modeling Toolkit - E2E RAG ☆295 Updated 3 months ago
- Synthetic Data for LLM Fine-Tuning ☆78 Updated 9 months ago
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆44 Updated last month
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆192 Updated 4 months ago
- This project enhances the construction of RAG applications by addressing challenges, improving accessibility, scalability, and managing d… ☆135 Updated 5 months ago
- Efficient vector database for hundreds of millions of embeddings. ☆196 Updated 4 months ago
- awesome synthetic (text) datasets ☆213 Updated this week