google-deepmind / onetwo
☆243, updated 7 months ago
Alternatives and similar repositories for onetwo
Users who are interested in onetwo are comparing it to the libraries listed below.
- Website for hosting the Open Foundation Models Cheat Sheet. ☆267, updated 5 months ago
- ☆142, updated last month
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆275, updated last year
- ☆211, updated last week
- Let's build better datasets, together! ☆262, updated 9 months ago
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ☆297, updated this week
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. ☆168, updated this week
- ☆61, updated 2 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆75, updated 10 months ago
- Training-Ready RL Environments + Evals ☆116, updated last week
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆104, updated 3 weeks ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆102, updated 5 months ago
- LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR t… ☆486, updated 7 months ago
- Modular, scalable library to train ML models ☆165, updated last week
- ☆124, updated 11 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆273, updated last year
- Draw more samples ☆194, updated last year
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆96, updated this week
- Toolkit for attaching, training, saving and loading of new heads for transformer models ☆288, updated 7 months ago
- ☆159, updated 10 months ago
- Library for text-to-text regression, applicable to any input string representation and allows pretraining and fine-tuning over multiple r… ☆268, updated this week
- Simple UI for debugging correlations of text embeddings ☆295, updated 4 months ago
- ☆225, updated 3 months ago
- ☆67, updated last year
- Public repository containing METR's DVC pipeline for eval data analysis ☆117, updated 6 months ago
- ☆123, updated last year
- ☆146, updated last year
- Hugging Face Deep Learning Containers (DLCs) for Google Cloud ☆153, updated 5 months ago
- Inference-time scaling for LLMs-as-a-judge. ☆300, updated last week
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆143, updated last week