google-deepmind / onetwo
☆188 · Updated 6 months ago
Alternatives and similar repositories for onetwo:
Users who are interested in onetwo are comparing it to the libraries listed below.
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆100 · Updated 10 months ago
- ☆147 · Updated 2 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆101 · Updated 2 months ago
- ☆77 · Updated 8 months ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆82 · Updated last year
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t… ☆374 · Updated last week
- Hugging Face Deep Learning Containers (DLCs) for Google Cloud ☆141 · Updated 3 weeks ago
- ☆122 · Updated last week
- A small library of LLM judges ☆143 · Updated 2 weeks ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆252 · Updated 7 months ago
- A holistic evaluation library for multi-modal generative models using Weave ☆27 · Updated 3 months ago
- ☆48 · Updated 8 months ago
- Website for hosting the Open Foundation Models Cheat Sheet. ☆262 · Updated 7 months ago
- ☆76 · Updated 8 months ago
- ☆117 · Updated 3 months ago
- Functional Benchmarks and the Reasoning Gap ☆82 · Updated 4 months ago
- ☆165 · Updated 8 months ago
- ☆51 · Updated 2 weeks ago
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆37 · Updated 10 months ago
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆197 · Updated 9 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆137 · Updated 2 weeks ago
- Set of scripts to finetune LLMs ☆36 · Updated 10 months ago
- awesome synthetic (text) datasets ☆261 · Updated 3 months ago
- ☆141 · Updated 7 months ago
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆93 · Updated this week
- Fast bare-bones BPE for modern tokenizer training ☆146 · Updated 4 months ago
- Fiddler Auditor is a tool to evaluate language models. ☆175 · Updated 11 months ago