google-deepmind / onetwo
☆226 Updated 3 months ago
Alternatives and similar repositories for onetwo
Users who are interested in onetwo are comparing it to the libraries listed below.
- Hugging Face Deep Learning Containers (DLCs) for Google Cloud ☆145 Updated last month
- ☆131 Updated 2 months ago
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. ☆150 Updated this week
- Scale your LLM-as-a-judge. ☆234 Updated last week
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆118 Updated 3 weeks ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆258 Updated 10 months ago
- Website for hosting the Open Foundation Models Cheat Sheet. ☆268 Updated last month
- ☆152 Updated 6 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆68 Updated 6 months ago
- ☆123 Updated 7 months ago
- ☆120 Updated 10 months ago
- Functional Benchmarks and the Reasoning Gap ☆86 Updated 8 months ago
- Multi-backend recommender systems with Keras 3 ☆125 Updated this week
- ☆111 Updated 5 months ago
- Fast bare-bones BPE for modern tokenizer training ☆157 Updated 2 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆99 Updated last year
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆276 Updated 10 months ago
- ☆179 Updated this week
- Source code for the collaborative reasoner research project at Meta FAIR. ☆87 Updated last month
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 10 months ago
- METR Task Standard ☆148 Updated 4 months ago
- Let's build better datasets, together! ☆259 Updated 5 months ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆302 Updated last year
- ☆180 Updated last month
- ☆121 Updated 2 months ago
- Draw more samples ☆191 Updated 11 months ago
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆126 Updated 2 years ago
- The history files when recording human interaction while solving ARC tasks ☆110 Updated 2 weeks ago
- Starter pack for NeurIPS LLM Efficiency Challenge 2023. ☆122 Updated last year
- ☆52 Updated last year