insoochung / transformer_bcq
BCQ tutorial for transformers
☆15 · Updated last year
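For context, binary-coding quantization (BCQ) approximates a weight matrix as a weighted sum of binary matrices, W ≈ Σᵢ αᵢBᵢ with Bᵢ ∈ {−1, +1}. Below is a minimal greedy sketch of that decomposition in PyTorch; the function names are illustrative assumptions, not code taken from this tutorial.

```python
import torch

def bcq_quantize(w: torch.Tensor, num_bits: int = 3):
    """Greedy BCQ: approximate a 2-D weight w as sum_i alpha_i * b_i,
    fitting each sign matrix b_i to the remaining residual."""
    residual = w.clone()
    alphas, codes = [], []
    for _ in range(num_bits):
        b = residual.sign()
        b[b == 0] = 1.0  # keep codes strictly in {-1, +1}
        # Per-row scale minimizing ||residual - alpha * b||^2 is mean(|residual|).
        alpha = residual.abs().mean(dim=1, keepdim=True)
        residual = residual - alpha * b
        alphas.append(alpha)
        codes.append(b)
    return alphas, codes

def bcq_dequantize(alphas, codes):
    return sum(a * b for a, b in zip(alphas, codes))

# Example: 3-bit approximation of a linear layer's weight.
w = torch.randn(256, 256)
w_hat = bcq_dequantize(*bcq_quantize(w, num_bits=3))
print((w - w_hat).abs().mean())  # error shrinks as num_bits grows
```

Each round fits a sign matrix to the current residual; the per-row scale mean(|residual|) is the least-squares optimum for that sign pattern.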
Related projects:
- Model Stock: All we need is just a few fine-tuned models. ☆75 · Updated 5 months ago
- A block-oriented training approach for inference-time optimization. ☆26 · Updated last month
- Awesome Triton Resources ☆16 · Updated 3 weeks ago
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization ☆27 · Updated last week
- An experiment in using Tangent to autodiff Triton. ☆66 · Updated 7 months ago
- Yet another random morning idea to be quickly tried, with the architecture shared if it works: allowing the transformer to pause for any amount… ☆50 · Updated 10 months ago
- A simple implementation of muP, based on "Spectral Condition for Feature Learning". The implementation is SGD-only; don't use it for Adam. ☆66 · Updated last month
- Some personal experiments around routing tokens to different autoregressive attention modules, akin to mixture-of-experts. ☆101 · Updated last year
- Training-free Post-training Efficient Sub-quadratic Complexity Attention, implemented with OpenAI Triton. ☆14 · Updated this week
- A CUDA implementation of autoregressive linear attention, with all the latest research findings. ☆43 · Updated last year
- A demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆34 · Updated 2 months ago
- An implementation of Infini-Transformer in PyTorch. ☆100 · Updated last month
- A simple PyTorch implementation of high-performance Multi-Query Attention (see the sketch after this list). ☆15 · Updated last year
- An unofficial implementation of the SOLAR-10.7B model and the newly proposed interlocked-DUS (iDUS), with experiment details. ☆10 · Updated 6 months ago
- Several types of attention modules written in PyTorch. ☆37 · Updated 4 months ago
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models". ☆57 · Updated 5 months ago
- Triangles in action! Triton. ☆14 · Updated 7 months ago
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision ☆63 · Updated last week
- A fast implementation of T5/UL2 in PyTorch using Flash Attention. ☆60 · Updated this week
- Here we will test various linear attention designs. ☆55 · Updated 4 months ago
- Some common Hugging Face transformers in maximal update parametrization (µP). ☆76 · Updated 2 years ago
- A Triton implementation of the HyperAttention algorithm. ☆46 · Updated 9 months ago
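As referenced in the Multi-Query Attention item above, here is a minimal PyTorch sketch of the general technique, in which all query heads attend over one shared key/value head. It is an illustration under standard assumptions (no causal mask, no dropout), not code from the linked repository.

```python
import math
import torch
from torch import nn

class MultiQueryAttention(nn.Module):
    """Multi-query attention: every query head shares a single K/V head,
    shrinking the KV cache by a factor of num_heads at decode time."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads, self.head_dim = num_heads, dim // num_heads
        self.q_proj = nn.Linear(dim, dim)
        # Keys/values project to one head and broadcast across query heads.
        self.k_proj = nn.Linear(dim, self.head_dim)
        self.v_proj = nn.Linear(dim, self.head_dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, dim)
        b, t, d = x.shape
        q = self.q_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        k, v = self.k_proj(x), self.v_proj(x)  # (b, t, head_dim), shared by all heads
        scores = torch.einsum("bhqd,bkd->bhqk", q, k) / math.sqrt(self.head_dim)
        out = torch.einsum("bhqk,bkd->bhqd", scores.softmax(dim=-1), v)
        return self.out_proj(out.transpose(1, 2).reshape(b, t, d))

# Example: MultiQueryAttention(512, 8)(torch.randn(2, 16, 512)).shape -> (2, 16, 512)
```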