vvvm23 / mezo-jax
JAX implementation of "Fine-Tuning Language Models with Just Forward Passes"
☆20 Updated last year
Related projects
Alternatives and complementary repositories for mezo-jax
- ☆46 Updated last month
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆34 Updated last year
- A simple hypernetwork implementation in jax using haiku. ☆23 Updated 2 years ago
- Official repository for the paper "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" ☆60 Updated 2 years ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆19 Updated last year
- ☆48 Updated 9 months ago
- ICML 2022: Learning Iterative Reasoning through Energy Minimization ☆44 Updated last year
- Efficient Scaling laws and collaborative pretraining. ☆13 Updated last week
- [NeurIPS'20] Code for the Paper Compositional Visual Generation and Inference with Energy Based Models ☆43 Updated last year
- ☆58 Updated 2 years ago
- ☆51 Updated 5 months ago
- ☆40 Updated 4 months ago
- Fast training of unitary deep network layers from low-rank updates ☆28 Updated last year
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆79 Updated 9 months ago
- ☆25 Updated last month
- Meta-learning inductive biases in the form of useful conserved quantities. ☆37 Updated 2 years ago
- Automatic Integration for Neural Spatio-Temporal Point Process models (AI-STPP) is a new paradigm for exact, efficient, non-parametric inf… ☆24 Updated last month
- ☆28 Updated last year
- ☆31 Updated 2 months ago
- ☆24 Updated 5 years ago
- Official code for the paper "Attention as a Hypernetwork" ☆23 Updated 5 months ago
- ☆17 Updated 2 years ago
- Official code for the paper: "Metadata Archaeology" ☆18 Updated last year
- A centralized place for deep thinking code and experiments ☆77 Updated last year
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021 ☆26 Updated 3 years ago
- ☆16 Updated last month
- Clockwork VAEs in JAX/Flax ☆32 Updated 3 years ago
- Experiment of using Tangent to autodiff triton ☆72 Updated 10 months ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 Updated 5 months ago