FOR-sight-ai / interpreto
Interpreto is an interpretability toolbox for LLMs
☆55 Updated last week
Alternatives and similar repositories for interpreto
Users who are interested in interpreto are comparing it to the libraries listed below.
- Overcomplete is a Vision-based SAE Toolbox ☆96 Updated 3 months ago
- ☆53 Updated last year
- Interpreting how transformers simulate agents performing RL tasks ☆88 Updated 2 years ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆102 Updated last month
- Sparse and discrete interpretability tool for neural networks ☆64 Updated last year
- Influenciae is a Tensorflow Toolbox for Influence Functions ☆64 Updated last year
- Build and train Lipschitz constrained networks: TensorFlow implementation of k-Lipschitz layers ☆100 Updated 7 months ago
- ☆57 Updated 3 years ago
- Cost aware hyperparameter tuning algorithm ☆172 Updated last year
- DiffuLab is designed to provide a simple and flexible way to train diffusion models while allowing full customization of its core compone… ☆40 Updated this week
- ☆120 Updated 4 months ago
- CODS - Conformal Object Detection and Segmentation ☆18 Updated this week
- Synchronized Curriculum Learning for RL Agents ☆114 Updated 2 months ago
- Comparison between GFlowNets & Maximum Entropy RL ☆19 Updated last year
- Deep Networks Grok All the Time and Here is Why ☆37 Updated last year
- [ICLR 2025] Official implementation of DICL (Disentangled In-Context Learning), featured in the paper "Zero-shot Model-based Reinforcemen… ☆27 Updated 8 months ago
- Code for the paper "Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making" ☆28 Updated last year
- Solving the Abstraction & Reasoning Corpus with DreamCoder ☆53 Updated last year
- Repository for the PGA-MAP-Elites algorithm. PGA-MAP-Elites was developed to efficiently scale MAP-Elites to large genotypes and noisy d… ☆58 Updated 4 years ago
- Sparse Autoencoder Training Library ☆55 Updated 6 months ago
- A reinforcement learning environment for the IGLU 2022 at NeurIPS ☆34 Updated 2 years ago
- Code for minimum-entropy coupling. ☆32 Updated last year
- Parameter-Free Optimizers for Pytorch ☆131 Updated last year
- Build and train Lipschitz-constrained networks: PyTorch implementation of 1-Lipschitz layers. For TensorFlow/Keras implementation, see ht… ☆34 Updated 2 weeks ago
- Universal Neurons in GPT2 Language Models ☆30 Updated last year
- ☆27 Updated last month
- WandB sweeps integration with Hydra sweeper ☆50 Updated last year
- Benchmarking RL for POMDPs in Pure JAX [Code for "Structured State Space Models for In-Context Reinforcement Learning" (NeurIPS 2023)] ☆110 Updated last year
- A TinyStories LM with SAEs and transcoders ☆13 Updated 7 months ago
- Codebase to fully reproduce the results of "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO" (M… ☆29 Updated 11 months ago