BatsResearch / bonito
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
☆789 · Updated last month
Alternatives and similar repositories for bonito
Users who are interested in bonito are comparing it to the libraries listed below.
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. ☆1,055 · Updated 7 months ago
- Evaluate your LLM's response with Prometheus and GPT4 ☆981 · Updated 4 months ago
- Automated Evaluation of RAG Systems ☆650 · Updated 5 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆2,867 · Updated last week
- Automatically evaluate your LLMs in Google Colab ☆658 · Updated last year
- Easily embed, cluster and semantically label text datasets ☆567 · Updated last year
- The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval ☆1,404 · Updated last year
- Train Models Contrastively in Pytorch ☆744 · Updated 5 months ago
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data … ☆765 · Updated 5 months ago
- Code for explaining and evaluating late chunking (chunked pooling) ☆447 · Updated 8 months ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,528 · Updated 3 months ago
- Generative Representational Instruction Tuning ☆670 · Updated 2 months ago
- This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,… ☆2,191 · Updated last year
- [NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models ☆658 · Updated 2 months ago
- ☆538 · Updated 9 months ago
- Data and tools for generating and inspecting OLMo pre-training data. ☆1,307 · Updated last week
- Fine-Tuning Embedding for RAG with Synthetic Data ☆510 · Updated 2 years ago
- An Open Source Toolkit For LLM Distillation ☆721 · Updated 2 months ago
- [ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive… ☆959 · Updated 10 months ago
- ☆1,034 · Updated 8 months ago
- A library for easily merging multiple LLM experts, and efficiently train the merged LLM. ☆491 · Updated last year
- Official repository for ORPO ☆465 · Updated last year
- Efficient Retrieval Augmentation and Generation Framework ☆1,657 · Updated 8 months ago
- Framework for enhancing LLMs for RAG tasks using fine-tuning. ☆748 · Updated 3 months ago
- ☆891 · Updated 10 months ago
- Repository for "MultiHop-RAG: A Dataset for Evaluating Retrieval-Augmented Generation Across Documents" (COLM 2024) ☆363 · Updated 5 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆1,866 · Updated last week
- ☆1,071 · Updated 11 months ago
- Open-source tool to visualise your RAG ☆1,153 · Updated 8 months ago
- [EMNLP 2024: Demo Oral] RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation ☆305 · Updated 10 months ago