JacksonWuxs / Forward-Forward-Network
Implementation of the Forward-Forward Network proposed by Hinton at NeurIPS 2022.
☆162 Updated last year
Related projects
Alternatives and complementary repositories for Forward-Forward-Network
- Reimplementation of Geoffrey Hinton's Forward-Forward Algorithm ☆130 Updated 11 months ago
- Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch ☆242 Updated 6 months ago
- Demonstrations of Loss of Plasticity and Implementation of Continual Backpropagation ☆166 Updated last week
- ☆179 Updated 11 months ago
- [EVA ICLR'23; LARA ICML'22] Efficient attention mechanisms via control variates, random features, and importance sampling ☆79 Updated last year
- Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch ☆291 Updated 4 months ago
- Crawl & visualize ICLR papers and reviews ☆107 Updated 2 years ago
- Implementation of Block Recurrent Transformer - PyTorch ☆214 Updated 2 months ago
- ☆97 Updated 8 months ago
- OpenReview Submission Visualization (ICLR 2024/2025) ☆140 Updated 3 weeks ago
- Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in PyTorch ☆393 Updated 8 months ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT ☆205 Updated 2 months ago
- [EMNLP 2022] Official implementation of Transnormer in our EMNLP 2022 paper - The Devil in Linear Transformer ☆54 Updated last year
- ☆65 Updated 7 months ago
- Forward Pass Learning and Inference Library, for neural networks and general intelligence, Signal Propagation (sigprop) ☆45 Updated last year
- PyTorch implementation of Mixer-nano (#parameters is 0.67M, originally Mixer-S/16 has 18M) with 90.83% acc. on CIFAR-10. Training from s… ☆28 Updated 3 years ago
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆49 Updated last week
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆203 Updated last year
- Some preliminary explorations of Mamba's context scaling. ☆190 Updated 9 months ago
- PyTorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5) ☆63 Updated 6 months ago
- Recurrent Memory Transformer ☆150 Updated last year
- Randomized Positional Encodings Boost Length Generalization of Transformers ☆79 Updated 7 months ago
- A curated list of awesome discrete diffusion model resources. ☆61 Updated this week
- ☆114 Updated 8 months ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆79 Updated last year
- ☆132 Updated last year
- Official implementation of TransNormerLLM: A Faster and Better LLM ☆229 Updated 9 months ago
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆38 Updated 3 months ago
- Implementation of Infini-Transformer in PyTorch ☆104 Updated last month
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆102 Updated 3 months ago