BatsResearch / bonito
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
★787 Updated last month
Alternatives and similar repositories for bonito
Users who are interested in bonito are comparing it to the libraries listed below.
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ★1,044 Updated 6 months ago
- Evaluate your LLM's response with Prometheus and GPT4 💯 ★979 Updated 3 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ★2,849 Updated last week
- Automated Evaluation of RAG Systems ★643 Updated 4 months ago
- Easily embed, cluster and semantically label text datasets ★566 Updated last year
- Efficient Retrieval Augmentation and Generation Framework ★1,639 Updated 7 months ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ★1,522 Updated 2 months ago
- The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval ★1,365 Updated 11 months ago
- This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,… ★2,175 Updated last year
- Code for explaining and evaluating late chunking (chunked pooling) ★440 Updated 7 months ago
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data … ★756 Updated 5 months ago
- Train Models Contrastively in Pytorch ★738 Updated 4 months ago
- Automatically evaluate your LLMs in Google Colab ★654 Updated last year
- Fine-Tuning Embedding for RAG with Synthetic Data ★509 Updated last year
- ★890 Updated 9 months ago
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive… ★957 Updated 10 months ago
- ★536 Updated 9 months ago
- [NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models ★655 Updated last month
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ★747 Updated 3 months ago
- Official repository for ORPO ★462 Updated last year
- Repository for "MultiHop-RAG: A Dataset for Evaluating Retrieval-Augmented Generation Across Documents" (COLM 2024) ★355 Updated 4 months ago
- Generative Representational Instruction Tuning ★664 Updated last month
- A library for easily merging multiple LLM experts and efficiently training the merged LLM. ★488 Updated 11 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ★1,820 Updated last week
- Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy ★1,280 Updated 2 months ago
- A reading list on LLM based Synthetic Data Generation 🔥 ★1,386 Updated 2 months ago
- ★1,052 Updated 11 months ago
- ★415 Updated last year
- RAGChecker: A Fine-grained Framework For Diagnosing RAG ★962 Updated 8 months ago
- Best practices for distilling large language models. ★572 Updated last year