All-Hands-AI / trajectory-visualizer
☆23 · Updated 3 weeks ago
Alternatives and similar repositories for trajectory-visualizer:
Users who are interested in trajectory-visualizer are comparing it to the repositories listed below.
- Implementation of "SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models" ☆27 · Updated 2 months ago
- Agent computer interface for AI software engineer. ☆70 · Updated this week
- ☆33 · Updated 10 months ago
- ☆27 · Updated last week
- [NAACL2025] LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications ☆72 · Updated this week
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆51 · Updated last month
- Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents ☆66 · Updated 2 weeks ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 · Updated 8 months ago
- Moatless Testbeds allows you to create isolated testbed environments in a Kubernetes cluster where you can apply code changes through git… ☆11 · Updated last month
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆25 · Updated 5 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆32 · Updated 3 weeks ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆54 · Updated 5 months ago
- Reasoning by Communicating with Agents ☆28 · Updated last week
- The Library for LLM-based multi-agent applications ☆79 · Updated 2 months ago
- Simple repository for training small reasoning models ☆27 · Updated 3 months ago
- ☆27 · Updated 10 months ago
- ☆13 · Updated last month
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆43 · Updated last year
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon! ☆44 · Updated last month
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆72 · Updated 8 months ago
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning ☆53 · Updated last month
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆33 · Updated 6 months ago
- Toy implementation of Strawberry ☆31 · Updated 7 months ago
- ☆40 · Updated 9 months ago
- Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks" ☆14 · Updated 2 weeks ago
- accompanying material for sleep-time compute paper ☆77 · Updated last week
- Aioli: A unified optimization framework for language model data mixing ☆25 · Updated 3 months ago
- Advanced Reasoning Benchmark Dataset for LLMs ☆46 · Updated last year
- ☆50 · Updated 5 months ago
- ☆81 · Updated 3 weeks ago