sap-ient-ai / FFF
FastFeedForward Networks
☆20Updated 2 years ago
Alternatives and similar repositories for FFF
Users who are interested in FFF are comparing it to the libraries listed below.
- ☆53Updated 2 years ago
- ☆82Updated last year
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"☆38Updated 7 months ago
- ☆109Updated 6 months ago
- ☆27Updated last year
- ☆50Updated last year
- Jax like function transformation engine but micro, microjax☆34Updated last year
- Memoria is a human-inspired memory architecture for neural networks.☆82Updated last year
- RWKV-7: Surpassing GPT☆104Updated last year
- GoldFinch and other hybrid transformer components☆45Updated last year
- ☆59Updated 2 months ago
- ☆35Updated last year
- ☆29Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆103Updated last year
- A repository for log-time feedforward networks☆224Updated last year
- ☆62Updated last year
- Implementation of GateLoop Transformer in Pytorch and Jax☆92Updated last year
- slowly building a set of infinite riddle generators for data-hungry methods☆14Updated 3 years ago
- Collection of autoregressive model implementations☆85Updated 2 weeks ago
- Evaluating the Mamba architecture on the Othello game☆49Updated last year
- Code accompanying the paper "LaProp: a Better Way to Combine Momentum with Adaptive Gradient"☆29Updated 5 years ago
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023)☆98Updated last year
- Jax Codebase for Evolutionary Strategies at the Hyperscale☆216Updated last month
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters☆131Updated last year
- Token Omission Via Attention☆128Updated last year
- H-Net Dynamic Hierarchical Architecture☆81Updated 4 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.☆46Updated last year
- Learning Universal Predictors☆81Updated last year
- Simple GRPO scripts and configurations.☆59Updated 11 months ago
- A repository for research on medium sized language models.☆77Updated last year