brandonstarxel / chunking_evaluation
This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation. It allows users to compare different chunking methods and includes implementations of several novel chunking strategies.
☆160 · Updated last month
Related projects
Alternatives and complementary repositories for chunking_evaluation
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ☆385 · Updated 9 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, LangChain, LlamaIndex, Fructose, Marvin, Outlines, etc. on task… ☆133 · Updated last month
- ☆182 · Updated 6 months ago
- Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text) ☆128 · Updated this week
- ☆180 · Updated last week
- awesome synthetic (text) datasets ☆242 · Updated 3 weeks ago
- ARAGOG: Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆96 · Updated 7 months ago
- ☆67 · Updated 3 weeks ago
- Use late-interaction multi-modal models such as ColPali in just a few lines of code. ☆617 · Updated last week
- ☆131 · Updated 4 months ago
- Automated knowledge graph creation SDK ☆112 · Updated 4 months ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆175 · Updated this week
- Code for explaining and evaluating late chunking (chunked pooling) ☆246 · Updated last month
- FastAPI wrapper around DSPy ☆214 · Updated 8 months ago
- A chatbot powered by a RAG pipeline that reads, summarizes, and quotes the most relevant papers related to the user query. ☆162 · Updated 6 months ago
- An awesome list of curated DSPy resources. ☆226 · Updated 2 months ago
- Additional packages (components, document stores, and the like) to extend the capabilities of Haystack version 2.0 and onwards ☆120 · Updated this week
- ☆204 · Updated 4 months ago
- This software contains an agent based on LangGraph & LangChain for solving general requests in the WhatsApp channel of this medical clini… ☆172 · Updated last month
- The central repo for all the RAG evaluation reference material and partner workshop ☆50 · Updated last month
- RAGArch is a Streamlit-based application that empowers users to experiment with various components and parameters of Retrieval-Augmented … ☆80 · Updated 9 months ago
- Let's build better datasets, together! ☆205 · Updated this week
- LLM-driven automated knowledge graph construction from text using DSPy and Neo4j. ☆154 · Updated 7 months ago
- Toolkit for attaching, training, saving, and loading new heads for transformer models ☆246 · Updated 2 weeks ago
- This project uses the LlamaIndex multi-agent concierge system and the Qdrant vector database to customize the RAG application with use… ☆43 · Updated 3 months ago
- Domain Adapted Language Modeling Toolkit - E2E RAG ☆311 · Updated last week
- 🦜💯 Flex those feathers! ☆234 · Updated 3 weeks ago
- A fast and lightweight pure Python library for splitting text into semantically meaningful chunks. ☆182 · Updated 4 months ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆162 · Updated 2 months ago
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ☆332 · Updated last month