A basic, pure PyTorch implementation of flash attention
☆16 · Oct 28, 2024 · Updated last year
Alternatives and similar repositories for flash_attention
Users interested in flash_attention are comparing it to the libraries listed below.
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and JAX ☆16 · Jun 16, 2024 · Updated last year
- new optimizer ☆20 · Aug 4, 2024 · Updated last year
- Experimental GPU language with meta-programming ☆26 · Sep 6, 2024 · Updated last year
- ☆34 · May 14, 2025 · Updated 9 months ago
- Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry ☆42 · Jan 15, 2024 · Updated 2 years ago
- Object recognition with Pepper using a deep learning model ☆10 · Sep 16, 2021 · Updated 4 years ago
- ☆10 · Nov 15, 2023 · Updated 2 years ago
- Extensive time series analysis of Chinese PM2.5 content, using models from ARMA and VAR to LSTMs and dynamic time warping clustering ☆11 · Aug 17, 2019 · Updated 6 years ago
- ☆21 · Jan 26, 2026 · Updated last month
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space" ☆25 · Jul 21, 2025 · Updated 7 months ago
- A curated list of awesome Inverse Reinforcement Learning resources. ☆41 · Feb 3, 2022 · Updated 4 years ago
- https://youtu.be/pE7UOYioPKk ☆10 · Feb 16, 2023 · Updated 3 years ago
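- A knowledge base built on native browser bookmarks: it syncs bookmarks across browsers using GitHub Gist, uses AI to generate summaries, tags, and cover images for bookmarks, and provides a clean web browsing experience. ☆30 · Jan 5, 2026 · Updated last month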
- Baseline system for Language-based Audio Retrieval (Task 6B) in DCASE 2023 Challenge ☆10 · Aug 8, 2023 · Updated 2 years ago
- ☆11 · Dec 22, 2024 · Updated last year
- Sample Flask Application to demonstrate OpenTelemetry instrumentation ☆10 · Nov 27, 2025 · Updated 3 months ago
- ☆13 · May 7, 2023 · Updated 2 years ago
- Unofficial implementation for Sigmoid Loss for Language Image Pre-Training ☆11 · Sep 26, 2023 · Updated 2 years ago
- ☆11 · Dec 9, 2025 · Updated 2 months ago
- Experiment utility code, specifically designed for use with Compute Canada. ☆11 · Jan 27, 2025 · Updated last year
- Construction of control systems from predictive models via Quantization, Simulation, Modeling, Optimization ☆12 · Sep 15, 2023 · Updated 2 years ago
- LLM-based character segmentation agent for ComfyUI based on SAM 3 and the SAM 3 Agent notebook ☆25 · Dec 22, 2025 · Updated 2 months ago
- Code Release for ICML 2024: MS-TIP - Imputation Aware Pedestrian Trajectory Prediction ☆19 · Apr 17, 2025 · Updated 10 months ago
- CUDA extensions for PyTorch ☆12 · Dec 2, 2025 · Updated 2 months ago
- ☆13 · Jun 18, 2024 · Updated last year
- Course repository for the Spring 2023 COMP664 course "Deep Learning" at UNC ☆14 · Apr 17, 2023 · Updated 2 years ago
- [SIGGRAPH Asia 2025] "ASIA: Adaptive 3D Segmentation using Few Image Annotations". ☆23 · Feb 14, 2026 · Updated 2 weeks ago
- PyTorch - Albert Large V2, Bert Base Uncased, Bert Large Uncased WWM Finetuned Squad, Distil Roberta Base, Roberta Base Squad2, Roberta l… ☆11 · Jul 10, 2020 · Updated 5 years ago
- BFloat16 Fused Adam Operator for PyTorch ☆16 · Nov 16, 2024 · Updated last year
- Optimized primitives for collective multi-GPU communication ☆10 · May 8, 2024 · Updated last year
- ☆20 · Oct 4, 2024 · Updated last year
- ☆10 · Oct 17, 2021 · Updated 4 years ago
- Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit ☆63 · Jun 21, 2023 · Updated 2 years ago
- ☆51 · Jan 28, 2024 · Updated 2 years ago
- Official implementation for RoMaP: Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampl… ☆21 · Aug 5, 2025 · Updated 6 months ago
- Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning ☆41 · Updated this week
- torchvision-based transforms that provide access to parameterization ☆16 · Dec 4, 2025 · Updated 2 months ago
- ☆12 · Jan 4, 2024 · Updated 2 years ago
- The code for the paper, 'Meta-Curvature, Eunbyung Park and Junier Oliver, NeurIPS 2019' ☆11 · Jan 20, 2020 · Updated 6 years ago