Trel725 / forward-forward
A simple Python implementation of forward-forward NN training by G. Hinton from NeurIPS 2022
☆21 Updated 2 years ago
Alternatives and similar repositories for forward-forward
Users who are interested in forward-forward are comparing it to the repositories listed below:
- Latest Weight Averaging (NeurIPS HITY 2022) ☆30 Updated 2 years ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆80 Updated last year
- ☆20 Updated 11 months ago
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 Updated 2 years ago
- ☆48 Updated last year
- Universal Neurons in GPT2 Language Models ☆29 Updated last year
- Official implementation of the transformer (TF) architecture suggested in a paper entitled "Looped Transformers as Programmable Computers… ☆27 Updated 2 years ago
- ModelDiff: A Framework for Comparing Learning Algorithms ☆57 Updated last year
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆59 Updated last year
- ☆37 Updated last year
- ☆34 Updated 2 years ago
- This repository includes code to reproduce the tables in "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explaine… ☆37 Updated 2 years ago
- ☆53 Updated 8 months ago
- [ICLR 2023] "Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!" Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen… ☆28 Updated last year
- CIFAR10 ResNets implemented in JAX+Flax ☆12 Updated 3 years ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆66 Updated 9 months ago
- ☆12 Updated 3 months ago
- Triton Implementation of HyperAttention Algorithm ☆48 Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆43 Updated last year
- ☆26 Updated 2 years ago
- Blog post ☆17 Updated last year
- ☆53 Updated last year
- A modern look at the relationship between sharpness and generalization [ICML 2023] ☆43 Updated last year
- ☆54 Updated 2 years ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643 ☆75 Updated last year
- ☆13 Updated 5 months ago
- Deep Networks Grok All the Time and Here is Why ☆37 Updated last year
- Implementation of experiments from The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning ☆17 Updated 2 years ago
- Efficient PScan implementation in PyTorch ☆16 Updated last year
- Official code for the paper "Attention as a Hypernetwork" ☆39 Updated last year