facebookresearch/GCD
Computing the greatest common divisor with transformers. Source code for the paper https://arxiv.org/abs/2308.15594
☆ 12 · Updated 7 months ago
Related projects
Alternatives and complementary repositories for GCD
- ☆ 18 · Updated 6 months ago
- FlexAttention w/ FlashAttention3 support · ☆ 26 · Updated last month
- Minimum Description Length probing for neural network representations · ☆ 16 · Updated last week
- ☆ 17 · Updated 2 weeks ago
- Code for the paper: https://arxiv.org/pdf/2309.06979.pdf · ☆ 16 · Updated 3 months ago
- Submission to the Inverse Scaling Prize · ☆ 23 · Updated last year
- Engineering the state of RNN language models (Mamba, RWKV, etc.) · ☆ 32 · Updated 5 months ago
- Implementation of Hyena Hierarchy in JAX · ☆ 10 · Updated last year
- No-GIL Python environment featuring NVIDIA deep learning libraries · ☆ 20 · Updated this week
- [ICML 24 NGSM workshop] Associative Recurrent Memory Transformer: implementation and scripts for training and evaluation · ☆ 29 · Updated this week
- Heavyweight Python dynamic analysis framework · ☆ 13 · Updated 6 months ago
- Source-to-source debuggable derivatives in pure Python · ☆ 14 · Updated 9 months ago
- Loop Nest: linear algebra compiler and code generator · ☆ 22 · Updated 2 years ago
- Personal solutions to the Triton Puzzles · ☆ 15 · Updated 3 months ago
- Experiment using Tangent to autodiff Triton · ☆ 71 · Updated 9 months ago
- ☆ 21 · Updated last month
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification · ☆ 11 · Updated last year
- Awesome Triton Resources · ☆ 18 · Updated 3 weeks ago
- Benchmarking different models on PyTorch 2.0 · ☆ 21 · Updated last year
- ☆ 18 · Updated last month
- Exploring an idea that forgoes efficiency and carries out attention across each edge between nodes (tokens) · ☆ 43 · Updated last month
- Triton implementation of the HyperAttention algorithm · ☆ 46 · Updated 10 months ago
- ☆ 25 · Updated 11 months ago
- Simple and efficient PyTorch-native transformer training and inference (batched) · ☆ 61 · Updated 7 months ago
- Transformer with Mu-Parameterization, implemented in JAX/Flax; supports FSDP on TPU pods · ☆ 29 · Updated last week
- Effort to open-source a 10.5-trillion-parameter Gemini model · ☆ 17 · Updated 11 months ago
- GoldFinch and other hybrid transformer components · ☆ 39 · Updated 3 months ago
- Proof of concept for global switching between numpy/jax/pytorch in a library · ☆ 18 · Updated 4 months ago
- ☆ 13 · Updated 4 months ago