safety-research / circuit-tracer
☆2,546 · Updated last week
Alternatives and similar repositories for circuit-tracer
Users who are interested in circuit-tracer are comparing it to the libraries listed below.
- open source interpretability platform 🧠 ☆632 · Updated last week
- Textbook on reinforcement learning from human feedback ☆1,416 · Updated this week
- Post-training with Tinker ☆2,756 · Updated this week
- [NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards ☆1,307 · Updated this week
- Our library for RL environments + evals ☆3,748 · Updated this week
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering ☆1,276 · Updated last week
- ☆1,381 · Updated 4 months ago
- Renderer for the harmony response format to be used with gpt-oss ☆4,135 · Updated last month
- A benchmark for LLMs on complicated tasks in the terminal ☆1,350 · Updated 3 weeks ago
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,463 · Updated 5 months ago
- OpenAI Frontier Evals ☆983 · Updated last month
- Synthetic data curation for post-training and structured data extraction ☆1,602 · Updated 2 weeks ago
- A library for mechanistic interpretability of GPT-style language models ☆3,005 · Updated this week
- Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents ☆1,794 · Updated 5 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆2,274 · Updated last week
- The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—b… ☆2,530 · Updated last week
- Large Concept Models: Language modeling in a sentence representation space ☆2,327 · Updated 11 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆939 · Updated 7 months ago
- Code and Data for Tau-Bench ☆1,058 · Updated 4 months ago
- τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment ☆662 · Updated last month
- Training Sparse Autoencoders on Language Models ☆1,169 · Updated this week
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… ☆2,033 · Updated last month
- Continuous Thought Machines, because thought takes time and reasoning is a process. ☆1,712 · Updated 3 weeks ago
- An agent benchmark with tasks in a simulated software company. ☆626 · Updated 2 months ago
- An interface library for RL post training with environments. ☆1,066 · Updated this week
- Humanity's Last Exam ☆1,304 · Updated 3 months ago
- [COLM 2025] LIMO: Less is More for Reasoning ☆1,062 · Updated 5 months ago
- ☆550 · Updated 7 months ago
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,182 · Updated 11 months ago
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆831 · Updated this week