⚓️ Repository for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper.
☆116Oct 27, 2025Updated 4 months ago
Alternatives and similar repositories for thought-anchors
Users who are interested in thought-anchors are comparing it to the repositories listed below
- ⚓️ Interactive playground for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper.☆18Dec 20, 2025Updated 2 months ago
- A Framework for Evaluating AI Agent Safety in Realistic Environments☆30Oct 2, 2025Updated 5 months ago
- ☆33Jul 9, 2025Updated 7 months ago
- ☆22Feb 13, 2026Updated 2 weeks ago
- A library for training crosscoders☆16May 28, 2025Updated 9 months ago
- Implementations of several self-supervised pretext tasks for language and vision modalities in PyTorch.☆13Jan 19, 2021Updated 5 years ago
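- An open-source mathematical large-model project that aims to explore whether large models possess mathematical creativity, and what potential they have in frontier mathematical research.☆17May 16, 2025Updated 9 months ago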
- James' cookbook of evaluations and finetuning experiments☆21Feb 19, 2026Updated last week
- A tiny easily hackable implementation of a feature dashboard.☆15Oct 21, 2025Updated 4 months ago
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning☆62Oct 24, 2025Updated 4 months ago
- ✱ Understanding the underlying learning dynamics of simple tasks in Transformer networks☆18Aug 16, 2024Updated last year
- RAG based chatbot for Global AI Hub☆26Oct 4, 2025Updated 4 months ago
- Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery"☆47May 31, 2024Updated last year
- ☆17Feb 14, 2024Updated 2 years ago
- UQ: Assessing Language Models on Unsolved Questions☆30Aug 26, 2025Updated 6 months ago
- Official code for our paper: "Language Models Learn to Mislead Humans via RLHF"☆19Oct 11, 2024Updated last year
- Persistent caching for Python functions☆17Dec 10, 2025Updated 2 months ago
- ☆17Feb 4, 2025Updated last year
- Code for steering and monitoring with concepts vectors in LLMs. https://arxiv.org/abs/2502.03708☆22Aug 10, 2025Updated 6 months ago
- Open source interpretability artefacts for R1.☆171Apr 21, 2025Updated 10 months ago
- [ICLR 2026] Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing☆29Feb 6, 2026Updated 3 weeks ago
- [ICLR'25] Official implementation of "KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models"☆20Jan 16, 2025Updated last year
- Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"☆19Dec 14, 2024Updated last year
- ☆15Feb 21, 2024Updated 2 years ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆26Aug 9, 2025Updated 6 months ago
- GoldFinch and other hybrid transformer components☆45Jul 20, 2024Updated last year
- ☆273Oct 1, 2024Updated last year
- Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges☆28May 14, 2025Updated 9 months ago
- ☆21Jul 25, 2025Updated 7 months ago
- ☆64Feb 4, 2026Updated 3 weeks ago
- Unified access to Large Language Model modules using NNsight☆93Updated this week
- Official PyTorch/Diffusers implementation of "RectifiedHR: Enable Efficient High Resolution Image Generation via Energy Rectification"☆30Oct 11, 2025Updated 4 months ago
- ☆30Jan 18, 2026Updated last month
- ☆58Nov 19, 2024Updated last year
- ☆100Aug 8, 2024Updated last year
- Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity (ACL 2025, oral)☆30Jun 14, 2025Updated 8 months ago
- A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models☆28Nov 25, 2024Updated last year
- Sparse Autoencoder Training Library☆55May 1, 2025Updated 10 months ago
- ☆25Dec 20, 2023Updated 2 years ago