agamm / semantic-split
A Python library to chunk/group your texts based on semantic similarity.
☆93 · Updated 7 months ago
Alternatives and similar repositories for semantic-split:
Users who are interested in semantic-split are comparing it to the libraries listed below:
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆147 · Updated 4 months ago
- A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks. ☆239 · Updated this week
- ARAGOG: Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆101 · Updated 10 months ago
- LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing, optimized for cloud deployment. ☆38 · Updated 4 months ago
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆145 · Updated last year
- Official code of the paper "SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation" ☆93 · Updated last month
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆119 · Updated last year
- 🦜💯 Flex those feathers! ☆239 · Updated 3 months ago
- ☆173 · Updated last week
- This repo is for handling Question Answering, especially for Multi-hop Question Answering ☆67 · Updated last year
- Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph ☆145 · Updated 10 months ago
- ☆115 · Updated 2 weeks ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆64 · Updated 3 months ago
- ☆205 · Updated 2 months ago
- Repository for “PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers”, NAACL24 ☆134 · Updated 8 months ago
- ☆38 · Updated last year
- DSPy in action with open-source LLMs. ☆64 · Updated 10 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆106 · Updated this week
- This package, developed as part of our research detailed in the Chroma Technical Report, provides tools for text chunking and evaluation.… ☆230 · Updated 4 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆76 · Updated 3 months ago
- Additional packages (components, document stores and the likes) to extend the capabilities of Haystack version 2.0 and onwards ☆132 · Updated this week
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆98 · Updated 5 months ago
- Generalist and Lightweight Model for Text Classification ☆65 · Updated 3 weeks ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆173 · Updated this week
- ☆139 · Updated 6 months ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆171 · Updated 5 months ago
- ☆91 · Updated last year
- Data extraction with LLM on CPU ☆112 · Updated last year
- Code for explaining and evaluating late chunking (chunked pooling) ☆321 · Updated last month
- Excel spreadsheet crawler and table parser for data extraction and querying ☆124 · Updated 2 months ago