Graph-ZKY / CaFo
A PyTorch implementation of the Cascaded Forward (CaFo) algorithm
☆22 · Updated 2 years ago
Alternatives and similar repositories for CaFo:
Users who are interested in CaFo are comparing it to the repositories listed below:
- Implementation of Forward Forward Network proposed by Hinton in NIPS 2022. ☆167 · Updated 2 years ago
- An implementation of the unsupervised example of the Forward-Forward algorithm proposed in (Hinton, 2022) ☆10 · Updated 9 months ago
- Git Re-Basin: Merging Models modulo Permutation Symmetries in PyTorch ☆75 · Updated 2 years ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization" ☆49 · Updated 10 months ago
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch ☆271 · Updated 11 months ago
- Understand and test language model architectures on synthetic tasks. ☆185 · Updated 3 weeks ago
- PyTorch implementation of Hinton's FF Algorithm with hard negatives sampling ☆14 · Updated 2 years ago
- The implementation for MLSys 2023 paper: "Cuttlefish: Low-rank Model Training without All The Tuning" ☆44 · Updated last year
- Repository containing code for blockwise SSL training ☆28 · Updated 5 months ago
- Implementation of "Gradients without backpropagation" paper (https://arxiv.org/abs/2202.08587) using functorch ☆108 · Updated last year
- Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models". ☆98 · Updated last year
- [EMNLP 2023 Main] Sparse Low-rank Adaptation of Pre-trained Language Models ☆72 · Updated last year
- Implementation of Infini-Transformer in Pytorch ☆110 · Updated 2 months ago
- CIFAR-10 speedruns: 94% in 2.6 seconds and 96% in 27 seconds ☆221 · Updated 3 weeks ago
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆64 · Updated 8 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆53 · Updated 7 months ago
- Official code for "TOAST: Transfer Learning via Attention Steering" ☆189 · Updated last year
- This is the official repository for the paper "Flora: Low-Rank Adapters Are Secretly Gradient Compressors" in ICML 2024. ☆102 · Updated 8 months ago
- Some preliminary explorations of Mamba's context scaling. ☆212 · Updated last year
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts" ☆53 · Updated last year
- Mixture of A Million Experts ☆42 · Updated 7 months ago
- Visualizing representations with diffusion based conditional generative model. ☆91 · Updated last year
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆35 · Updated 2 years ago
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark" ☆13 · Updated 9 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆150 · Updated last year
- Sharpness-Aware Minimization Leads to Low-Rank Features [NeurIPS 2023] ☆28 · Updated last year
- A repository for log-time feedforward networks ☆220 · Updated 11 months ago