Graph-ZKY / CaFo
A PyTorch implementation of the Cascaded Forward (CaFo) algorithm
☆22 · Updated 2 years ago
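For orientation, below is a minimal PyTorch sketch of the general cascaded-forward training pattern: each block is paired with its own local prediction head and trained on a local loss, with no gradient flowing between blocks. The block sizes, the cross-entropy local loss, and the softmax-averaging aggregation are illustrative assumptions, not details taken from the Graph-ZKY/CaFo code.

```python
# A minimal sketch of cascaded training with no backprop between blocks,
# assuming per-block cross-entropy heads and softmax-averaged predictions.
# Layer sizes and hyperparameters are illustrative, not from Graph-ZKY/CaFo.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CascadeBlock(nn.Module):
    """A conv block plus its own local prediction head."""
    def __init__(self, in_ch, out_ch, num_classes):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.head = nn.Linear(out_ch, num_classes)  # acts on pooled features

    def forward(self, x):
        feats = self.body(x)
        logits = self.head(feats.mean(dim=(2, 3)))  # global average pooling
        return feats, logits

blocks = nn.ModuleList([CascadeBlock(3, 32, 10), CascadeBlock(32, 64, 10)])
optimizers = [torch.optim.Adam(b.parameters(), lr=1e-3) for b in blocks]

def train_step(x, y):
    """Each block minimizes its own local loss; the next block gets detached inputs."""
    for block, opt in zip(blocks, optimizers):
        feats, logits = block(x)
        loss = F.cross_entropy(logits, y)
        opt.zero_grad()
        loss.backward()
        opt.step()
        x = feats.detach()  # no gradient crosses block boundaries

@torch.no_grad()
def predict(x):
    """Aggregate per-block predictions by averaging their softmax outputs."""
    probs = torch.zeros(x.size(0), 10)
    for block in blocks:
        x, logits = block(x)
        probs += logits.softmax(dim=-1)
    return probs.argmax(dim=-1)

# Smoke test on random CIFAR-10-sized inputs.
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
train_step(x, y)
print(predict(x).shape)  # torch.Size([8])
```

In a fully greedy scheme one would typically train each block before moving on to the next; the single combined step above is only meant to show where gradients do and do not flow.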
Alternatives and similar repositories for CaFo:
Users interested in CaFo are comparing it to the repositories listed below.
- Implementation of the Forward-Forward network proposed by Hinton (NeurIPS 2022).☆169 · Updated 2 years ago
- PyTorch implementation of Hinton's FF algorithm with hard negative sampling☆14 · Updated 2 years ago
- Official code for "TOAST: Transfer Learning via Attention Steering"☆190 · Updated last year
- Git Re-Basin: Merging Models modulo Permutation Symmetries in PyTorch☆75 · Updated 2 years ago
- Implementation/simulation of the predictive forward-forward credit assignment algorithm for training neurobiologically-plausible recurren…☆56 · Updated 2 years ago
- PyTorch implementation of "From Sparse to Soft Mixtures of Experts"☆53 · Updated last year
- ☆28 · Updated 10 months ago
- Reimplementation of Geoffrey Hinton's Forward-Forward Algorithm☆146 · Updated last year
- Code accompanying the paper "Massive Activations in Large Language Models"☆154 · Updated last year
- Model Fusion via Optimal Transport, NeurIPS 2020☆143 · Updated 2 years ago
- Code release for REPAIR: REnormalizing Permuted Activations for Interpolation Repair☆47 · Updated last year
- ☆102 · Updated last year
- Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch☆282 · Updated 2 weeks ago
- A curated list of Model Merging methods.☆91 · Updated 7 months ago
- A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).☆143 · Updated 3 months ago
- PyTorch implementation of Mixer-nano (0.67M parameters vs. 18M for the original Mixer-S/16) with 90.83% accuracy on CIFAR-10. Training from s…☆32 · Updated 3 years ago
- An implementation of the unsupervised example of the Forward-Forward algorithm proposed by Hinton (2022)☆10 · Updated 10 months ago
- A repository for log-time feedforward networks☆221 · Updated last year
- Code for "Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?" [ICML 2023]☆32 · Updated 7 months ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)☆71 · Updated last year
- ☆48 · Updated 4 months ago
- Model Zoos published at the NeurIPS 2022 Dataset & Benchmark track: "Model Zoos: A Dataset of Diverse Populations of Neural Network Model…☆54 · Updated last year
- Activation-aware Singular Value Decomposition for Compressing Large Language Models☆62 · Updated 5 months ago
- Repository containing code for blockwise SSL training☆29 · Updated 6 months ago
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning☆31 · Updated last year
- ☆49 · Updated last year
- Official implementation for Equivariant Architectures for Learning in Deep Weight Spaces [ICML 2023]☆89 · Updated last year
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts☆118 · Updated 6 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆89 · Updated this week
- ☆24 · Updated 6 months ago