google-deepmind / onetwo
☆177Updated 3 months ago
Related projects
Alternatives and complementary repositories for onetwo
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle☆127Updated this week
- Automating enterprise workflows with multimodal agents☆95Updated last month
- A programming framework for agentic AI. Discord: https://discord.gg/pAbnFJrkgZ☆119Updated last week
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆128Updated last month
- Manage scalable open LLM inference endpoints in Slurm clusters☆237Updated 4 months ago
- WIP - Allows you to create DSPy pipelines using ComfyUI☆180Updated 3 months ago
- Just a bunch of benchmark logs for different LLMs☆115Updated 3 months ago
- Automatic Evals for Instruction-Tuned Models☆65Updated this week
- ☆101Updated 3 months ago
- ☆93Updated last month
- Banishing LLM Hallucinations Requires Rethinking Generalization☆261Updated 4 months ago
- Code and Data for Tau-Bench☆204Updated this week
- Long context evaluation for large language models☆190Updated this week
- Website for hosting the Open Foundation Models Cheat Sheet.☆257Updated 4 months ago
- Draw more samples☆179Updated 5 months ago
- ☆162Updated 5 months ago
- ☆106Updated 3 months ago
- AWM: Agent Workflow Memory☆208Updated last month
- Comprehensive analysis of the differences in performance between QLoRA, LoRA, and full fine-tunes.☆81Updated last year
- ☆39Updated this week
- Doing simple retrieval from LLM models at various context lengths to measure accuracy☆97Updated 7 months ago
- Functional Benchmarks and the Reasoning Gap☆78Updated last month
- Tutorial for building LLM router☆163Updated 4 months ago
- ☆204Updated 4 months ago
- A tool that facilitates easy, efficient and high-quality fine-tuning of Cohere's models☆50Updated last week
- A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.☆89Updated this week
- ☆129Updated 3 weeks ago
- Let's build better datasets, together!☆206Updated this week
- Mixing Language Models with Self-Verification and Meta-Verification☆97Updated last year
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆106Updated this week