sap-ient-ai / FFF
FastFeedForward Networks (☆18)
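As a rough illustration of the fast-feedforward idea the repo name refers to (hard binary-tree routing over small leaf experts, so inference evaluates only a log-depth path and a single expert), here is a minimal numpy sketch. All names, sizes, and the exact routing rule are assumptions for illustration, not this repo's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

depth = 3                      # binary tree with 2**depth leaf experts
d_in, d_hidden, d_out = 8, 16, 8

# one linear routing direction per internal node (2**depth - 1 of them)
node_w = rng.normal(size=(2**depth - 1, d_in))
# one tiny feedforward "expert" per leaf
leaf_w1 = 0.1 * rng.normal(size=(2**depth, d_in, d_hidden))
leaf_w2 = 0.1 * rng.normal(size=(2**depth, d_hidden, d_out))

def fff_forward(x):
    """Hard-routed inference: descend the tree, then evaluate one leaf only."""
    node = 0
    for _ in range(depth):                    # log-depth routing path
        go_right = float(node_w[node] @ x) > 0.0
        node = 2 * node + (2 if go_right else 1)
    leaf = node - (2**depth - 1)              # index among the leaves: 0 .. 2**depth - 1
    h = np.maximum(x @ leaf_w1[leaf], 0.0)    # ReLU hidden layer of the chosen expert
    return h @ leaf_w2[leaf]

y = fff_forward(rng.normal(size=d_in))        # only 1 of the 8 experts was evaluated
```

During training the real method softens the routing decisions so gradients flow through the tree; the sketch above shows only the hard-routed inference-time behavior.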
Related projects
Alternatives and complementary repositories for FFF
- Memoria, a human-inspired memory architecture for neural networks (☆57)
- microjax: a JAX-like function-transformation engine, but micro (☆26)
- The code behind our practical dive into using Mamba for information extraction (☆50)
- Implementation of the GateLoop Transformer in PyTorch and JAX (☆86)
- Demo of the unit_scaling library, showing how a model can easily be adapted to train in FP8 (☆35)
- A MAD laboratory to improve AI architecture designs 🧪 (☆95)
- Collection of autoregressive model implementations (☆66)
- RWKV-7: Surpassing GPT (☆40)
- An introduction to LLM sampling (☆18)
- GoldFinch and other hybrid Transformer components (☆39)
- Code repository for the c-BTM paper (☆105)
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" (☆171)
- A byte-level decoder architecture that matches the performance of tokenized Transformers (☆58)
- Transformer with Mu-Parameterization, implemented in JAX/Flax; supports FSDP on TPU pods (☆29)
- Implementation of the Mamba SSM with Hugging Face integration (☆55)
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" (☆36)
- Latent Large Language Models (☆16)
- gzip Predicts Data-dependent Scaling Laws (☆32)
- Token Omission Via Attention (☆119)
- Experiments in training a new and improved T5 (☆76)