vertaix / Vendi-Score
☆97 Updated 6 months ago
Related projects
Alternatives and complementary repositories for Vendi-Score
- Training and evaluating NBM and SPAM for interpretable machine learning. ☆76 Updated last year
- This repository contains a Jax implementation of conformal training corresponding to the ICLR'22 paper "learning optimal conformal classi… ☆121 Updated 2 years ago
- ☆164 Updated last year
- A statistical toolkit for scientific discovery using machine learning ☆70 Updated 4 months ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆95 Updated last year
- Implementation of Discrete Key / Value Bottleneck, in Pytorch ☆87 Updated last year
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 Updated 2 years ago
- Framework code with wandb, checkpointing, logging, configs, experimental protocols. Useful for fine-tuning models or training from scratc… ☆146 Updated last year
- ☆96 Updated 2 years ago
- Unofficial implementation of Conformal Language Modeling by Quach et al ☆29 Updated last year
- Transformers with doubly stochastic attention ☆40 Updated 2 years ago
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆34 Updated last year
- Code for papers Linear Algebra with Transformers (TMLR) and What is my Math Transformer Doing? (AI for Maths Workshop, Neurips 2022) ☆64 Updated 3 months ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆120 Updated last year
- ☆187 Updated 2 years ago
- Sparse and discrete interpretability tool for neural networks ☆55 Updated 9 months ago
- ☆75 Updated last year
- ☆46 Updated last month
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆95 Updated last year
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆79 Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆49 Updated last year
- Sequence Modeling with Structured State Spaces ☆60 Updated 2 years ago
- ☆109 Updated 2 years ago
- Contrastive neighbor embeddings ☆52 Updated 6 months ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆52 Updated last month
- Implementation of Bitune: Bidirectional Instruction-Tuning ☆15 Updated 5 months ago
- Data for "Datamodels: Predicting Predictions with Training Data" ☆90 Updated last year
- Language models scale reliably with over-training and on downstream tasks ☆94 Updated 7 months ago
- Experiment with diffusion models that you can run on your local jupyter instances ☆55 Updated 3 weeks ago
- Official implementation of FIND (NeurIPS '23) Function Interpretation Benchmark and Automated Interpretability Agents ☆45 Updated last month