A lightweight library for generating synthetic instruction-tuning datasets for your data without GPT.
☆823 · Jul 15, 2025 · Updated 8 months ago
Alternatives and similar repositories for bonito
Users that are interested in bonito are comparing it to the libraries listed below.
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆3,131 · Mar 16, 2026 · Updated last week
- Easily embed, cluster and semantically label text datasets ☆600 · Mar 28, 2024 · Updated last year
- Tools for merging pretrained large language models. ☆6,895 · Mar 15, 2026 · Updated last week
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data … ☆833 · Mar 17, 2025 · Updated last year
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,889 · May 17, 2025 · Updated 10 months ago
- Go ahead and axolotl questions ☆11,508 · Updated this week
- Evaluate your LLM's response with Prometheus and GPT4 💯 ☆1,057 · Apr 25, 2025 · Updated 11 months ago
- Set of scripts to finetune LLMs ☆38 · Mar 30, 2024 · Updated last year
- Stanford NLP Python library for Representation Finetuning (ReFT) ☆1,563 · Mar 5, 2026 · Updated 2 weeks ago
- ☆567 · Nov 20, 2024 · Updated last year
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,965 · Mar 16, 2026 · Updated last week
- Create Custom LLMs ☆1,820 · Nov 8, 2025 · Updated 4 months ago
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ☆3,739 · May 21, 2025 · Updated 10 months ago
- Structured Outputs ☆13,588 · Updated this week
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤 ☆1,101 · Feb 2, 2025 · Updated last year
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆1,605 · Dec 20, 2025 · Updated 3 months ago
- Generate textbook-quality synthetic LLM pretraining data ☆509 · Oct 19, 2023 · Updated 2 years ago
- The official implementation of Self-Play Fine-Tuning (SPIN) ☆1,234 · May 8, 2024 · Updated last year
- Robust recipes to align language models with human and AI preferences ☆5,535 · Sep 8, 2025 · Updated 6 months ago
- Curated list of datasets and tools for post-training. ☆4,344 · Mar 9, 2026 · Updated 2 weeks ago
- DSPy: The framework for programming—not prompting—language models ☆33,038 · Updated this week
- Chat language model that can use tools and interpret the results ☆1,594 · Dec 3, 2025 · Updated 3 months ago
- [ACL 2024] Progressive LLaMA with Block Expansion. ☆514 · May 20, 2024 · Updated last year
- Using multiple LLMs for ensemble Forecasting ☆16 · Jan 17, 2024 · Updated 2 years ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆126 · May 7, 2024 · Updated last year
- FuseAI Project ☆592 · Jan 25, 2025 · Updated last year
- Training LLMs with QLoRA + FSDP ☆1,540 · Nov 9, 2024 · Updated last year
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆13,256 · Updated this week
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆61 · Apr 8, 2024 · Updated last year
- 📚 Datasets and models for instruction-tuning ☆238 · Sep 23, 2023 · Updated 2 years ago
- clean & curate your data with LLMs. ☆489 · Jun 26, 2024 · Updated last year
- Supercharge Your LLM Application Evaluations 🚀 ☆13,008 · Feb 24, 2026 · Updated last month
- Large Action Model framework to develop AI Web Agents ☆6,318 · Jan 21, 2025 · Updated last year
- Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024) ☆68,728 · Mar 18, 2026 · Updated last week
- Aligning pretrained language models with instruction data generated by themselves. ☆4,587 · Mar 27, 2023 · Updated 2 years ago
- structured outputs for llms ☆12,589 · Updated this week
- Unsloth Studio is a web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally. ☆57,673 · Updated this week
- [TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models ☆125 · Mar 6, 2026 · Updated 2 weeks ago
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings. ☆92 · Jul 21, 2024 · Updated last year