facebookresearch / GCD
Computing the greatest common divisor with transformers; source code for the paper https://arxiv.org/abs/2308.15594
☆12 · Updated 7 months ago
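Since the paper trains transformers to predict gcd(a, b) from the two operands, the ground-truth labels come from ordinary integer arithmetic. The sketch below, which is illustrative and not code from this repository, shows one way such (operand, operand, label) training triples could be generated; the function name `make_gcd_examples` and all parameters are hypothetical, and `math.gcd` from the standard library supplies the labels.

```python
import math
import random

def make_gcd_examples(n_pairs, max_int=1000, seed=0):
    """Generate (a, b, gcd(a, b)) triples as toy training data.

    Hypothetical sketch: the repository's actual data pipeline may
    differ. Labels are computed with the stdlib math.gcd.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(n_pairs):
        a = rng.randint(1, max_int)
        b = rng.randint(1, max_int)
        examples.append((a, b, math.gcd(a, b)))
    return examples

pairs = make_gcd_examples(3)
```

A model would then be trained to map a tokenized representation of (a, b) to the tokenized gcd; the seeded `random.Random` keeps the toy dataset reproducible.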
Related projects
Alternatives and complementary repositories for GCD
- ☆18 · Updated 7 months ago
- Source-to-Source Debuggable Derivatives in Pure Python ☆14 · Updated 9 months ago
- Benchmarking different PyTorch 2.0 models ☆21 · Updated last year
- Code for the paper https://arxiv.org/pdf/2309.06979.pdf ☆16 · Updated 3 months ago
- FlexAttention with FlashAttention3 support ☆27 · Updated last month
- No-GIL Python environment featuring NVIDIA deep learning libraries ☆22 · Updated this week
- ☆15 · Updated last month
- Minimum Description Length probing for neural network representations ☆16 · Updated last week
- ☆17 · Updated 3 weeks ago
- Mamba support for TransformerLens ☆12 · Updated 2 months ago
- Loop Nest: linear algebra compiler and code generator ☆22 · Updated 2 years ago
- Heavyweight Python dynamic analysis framework ☆13 · Updated 7 months ago
- Personal solutions to the Triton Puzzles ☆16 · Updated 4 months ago
- A small Python library to run iterators in a separate process ☆10 · Updated 7 months ago
- Experiment of using Tangent to autodiff Triton ☆72 · Updated 10 months ago
- ☆25 · Updated last month
- Implementation of Hyena Hierarchy in JAX ☆10 · Updated last year
- Awesome Triton resources ☆18 · Updated last month
- Make Triton easier ☆41 · Updated 5 months ago
- Repository of machine learning benchmarks ☆32 · Updated this week
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 · Updated 5 months ago
- The ToolSelect dataset, used to fine-tune Llama-2 70B for tool selection ☆17 · Updated 8 months ago
- Proof of concept for global switching between NumPy/JAX/PyTorch in a library ☆18 · Updated 5 months ago
- [ICML 24 NGSM workshop] Associative Recurrent Memory Transformer: implementation and scripts for training and evaluation ☆31 · Updated this week
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 · Updated last year
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024) ☆24 · Updated 5 months ago
- ☆15 · Updated 10 months ago
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023] ☆14 · Updated last year