BatsResearch / bonito
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
☆819, updated 6 months ago
Alternatives and similar repositories for bonito
Users who are interested in bonito are comparing it to the libraries listed below.
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. (☆1,093, updated last year)
- Automated Evaluation of RAG Systems (☆689, updated 10 months ago)
- Evaluate your LLM's response with Prometheus and GPT4 (☆1,043, updated 9 months ago)
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. (☆1,594, updated last month)
- Fine-Tuning Embedding for RAG with Synthetic Data (☆523, updated 2 years ago)
- Code for explaining and evaluating late chunking (chunked pooling) (☆487, updated last year)
- Easily embed, cluster and semantically label text datasets (☆592, updated last year)
- Generative Representational Instruction Tuning (☆686, updated 7 months ago)
- Efficient Retrieval Augmentation and Generation Framework (☆1,766, updated last month)
- (☆907, updated last year)
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive… (☆976, updated last year)
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… (☆3,084, updated 2 weeks ago)
- Automatically evaluate your LLMs in Google Colab (☆685, updated last year)
- The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval (☆1,572, updated last year)
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. (☆575, updated this week)
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy (☆1,477, updated last week)
- Train Models Contrastively in Pytorch (☆775, updated 10 months ago)
- Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding (☆418, updated 2 years ago)
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data … (☆826, updated 10 months ago)
- [NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models (☆677, updated 7 months ago)
- This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,… (☆2,315, updated last year)
- Framework for enhancing LLMs for RAG tasks using fine-tuning. (☆765, updated last month)
- RAGChecker: A Fine-grained Framework For Diagnosing RAG (☆1,057, updated last year)
- Repository for "MultiHop-RAG: A Dataset for Evaluating Retrieval-Augmented Generation Across Documents" (COLM 2024) (☆422, updated 10 months ago)
- Open-source tool to visualise your RAG (☆1,216, updated last year)
- (☆1,197, updated last month)
- Best practices for distilling large language models. (☆604, updated 2 years ago)
- (☆1,033, updated last year)
- Data and tools for generating and inspecting OLMo pre-training data. (☆1,404, updated 3 months ago)
- Official repository for ORPO (☆469, updated last year)