From MHA, MQA, and GQA to MLA, by 苏剑林, with code
☆43Feb 19, 2025Updated last year
Alternatives and similar repositories for MLA_tutorial
Users that are interested in MLA_tutorial are comparing it to the libraries listed below
- ☆19Aug 20, 2025Updated 6 months ago
- ☆31Aug 25, 2023Updated 2 years ago
- DsNet: A Novel Hybrid Architecture of Convolution and Transformer for Real-time Weld Seam Image Segmentation☆13Sep 1, 2024Updated last year
- My blog☆11Updated this week
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆28Jan 13, 2026Updated last month
- ☆39Apr 9, 2024Updated last year
- ☆36Aug 25, 2023Updated 2 years ago
- CenterPoint model trained with MMDetection3d on custom dataset, and deployed with TensorRT☆35Mar 15, 2023Updated 2 years ago
- Implement Flash Attention using Cute.☆101Dec 17, 2024Updated last year
- ☆18Feb 13, 2026Updated 2 weeks ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi…☆23Oct 1, 2025Updated 4 months ago
- ☆33Dec 10, 2025Updated 2 months ago
- The official github repo for the open online courses: "Dive into LLMs".☆10Mar 15, 2024Updated last year
- ☆16Sep 18, 2025Updated 5 months ago
- ☆24Nov 21, 2025Updated 3 months ago
- Math24o: High School Olympiad Mathematics Chinese Benchmark☆11Mar 27, 2025Updated 11 months ago
- Learning a range of topics by following Tensorrt_pro☆40Nov 25, 2022Updated 3 years ago
- A lightweight, production-ready C++ library for LLM tokenization, fully compatible with HuggingFace tokenizer.json.☆24Jan 4, 2026Updated last month
- ☆16Apr 1, 2025Updated 11 months ago
- The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models.☆13Dec 16, 2024Updated last year
- a fast and customizable CUDA int4 tensor core gemm☆15Aug 2, 2024Updated last year
- Easy Dataset Docs☆14Jan 21, 2026Updated last month
- ☆11Aug 27, 2024Updated last year
- Persistent dense gemm for Hopper in `CuTeDSL`☆15Aug 9, 2025Updated 6 months ago
- ☆11Apr 26, 2025Updated 10 months ago
- ☆59Mar 8, 2025Updated 11 months ago
- CAD - Memory Efficient Convolutional Adapter for Segment Anything☆12Oct 4, 2024Updated last year
- ☆13May 12, 2025Updated 9 months ago
- ☆13Feb 6, 2025Updated last year
- SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks (CVPR'25)☆19Jul 1, 2025Updated 8 months ago
- Quantize yolov7 using pytorch_quantization.🚀🚀🚀☆12Oct 20, 2023Updated 2 years ago
- Official repository of the paper "Explainable Deep Learning Methods in Medical Image Classification: A Survey", ACM Computing Surveys (CS…☆10Jan 9, 2024Updated 2 years ago
- Simplest online-softmax notebook for explaining Flash Attention☆16Jan 27, 2026Updated last month
- ☆10Jan 4, 2017Updated 9 years ago
- mixedbread ai python sdk☆12Jul 1, 2024Updated last year
- [ACL 2025] LongSafety: Evaluating Long-Context Safety of Large Language Models☆15Jun 18, 2025Updated 8 months ago
- 💻NUAA 2018 operating systems assignment: simulated memory allocation program (BF algorithm)☆13Jul 2, 2018Updated 7 years ago
- LLMTechSite, focused on the technical ecosystem of artificial general intelligence.☆12Jan 23, 2026Updated last month
- Experiments with representation engineering☆13Feb 28, 2024Updated 2 years ago