aidecentralized / sonar
SONAR - Self-Organizing Network of Aggregated Representations
☆22 Updated 5 months ago
Alternatives and similar repositories for sonar
Users who are interested in sonar are comparing it to the libraries listed below.
- Open source interpretability artefacts for R1. ☆165 Updated 8 months ago
- ☆235 Updated last week
- Curated collection of community environments ☆200 Updated last week
- ☆17 Updated 3 weeks ago
- ☆150 Updated 4 months ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆128 Updated 2 months ago
- Public repository containing METR's DVC pipeline for eval data analysis ☆178 Updated 9 months ago
- Repository with sample code using Apollo's suggested engineering practices ☆15 Updated last year
- A catalogue of existing Nanda servers ☆190 Updated 8 months ago
- open source interpretability platform 🧠 ☆621 Updated this week
- SIMD quantization kernels ☆92 Updated 4 months ago
- Benchmarks for the Evaluation of LLM Supervision ☆32 Updated this week
- ☆107 Updated last month
- METR Task Standard ☆169 Updated 11 months ago
- Contains random samples referenced in the paper "Sleeper Agents: Training Robustly Deceptive LLMs that Persist Through Safety Training". ☆123 Updated last year
- A reading list of relevant papers and projects on foundation model annotation ☆28 Updated 10 months ago
- anything you want can be built with morph cloud ☆26 Updated 2 months ago
- A 7B parameter model for mathematical reasoning ☆41 Updated 10 months ago
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆390 Updated this week
- [EMNLP 2025 Demo] TinyScientist: A Lightweight Framework for Building Research Agents ☆125 Updated 2 months ago
- ☆178 Updated last month
- Code for "Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs" ☆86 Updated 10 months ago
- ☆240 Updated last month
- ☆18 Updated this week
- ControlArena is a collection of settings, model organisms and protocols for running control experiments. ☆147 Updated this week
- Library for text-to-text regression, applicable to any input string representation and allows pretraining and fine-tuning over multiple r… ☆305 Updated 3 weeks ago
- Course Materials for Interpretability of Large Language Models (0368.4264) at Tel Aviv University ☆279 Updated 3 weeks ago
- ☆104 Updated 5 months ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning" ☆283 Updated last month
- ⚖️ Awesome LLM Judges ⚖️ ☆148 Updated 8 months ago