from MHA, MQA, GQA to MLA by 苏剑林, with code
☆49Feb 19, 2025Updated last year
Alternatives and similar repositories for MLA_tutorial
Users that are interested in MLA_tutorial are comparing it to the libraries listed below.
- ☆23Aug 20, 2025Updated 10 months ago
- Official repository for Flash Local Linear Attention☆37May 28, 2026Updated last month
- Selective Copying Task with Mamba Model. This repository contains a simple implementation for reproducing the selective copying task with…☆14Jun 3, 2024Updated 2 years ago
- Implement Flash Attention using Cute.☆108Dec 17, 2024Updated last year
- Math24o: High School Olympiad Mathematics Chinese Benchmark☆13Mar 27, 2025Updated last year
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆32Mar 25, 2026Updated 3 months ago
- Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-Ranking☆25Apr 4, 2025Updated last year
- ☆32Aug 25, 2023Updated 2 years ago
- Implement custom operators in PyTorch with cuda/c++☆77Jan 1, 2023Updated 3 years ago
- Modeling methods of System Dynamics – Supply Chain Simulation using the Anylogic software☆10Jan 8, 2026Updated 5 months ago
- Materials associated with the Agent-based Modelling training series☆11Mar 18, 2022Updated 4 years ago
- ☆58Jan 5, 2026Updated 5 months ago
- ASR project with pytorch-lightning☆20Mar 21, 2025Updated last year
- Cross Visual Prompt Tuning [ICCV 2025]☆13Aug 3, 2025Updated 11 months ago
- ☆37Aug 25, 2023Updated 2 years ago
- CenterPoint model trained with MMDetection3d on custom dataset, and deployed with TensorRT☆35Mar 15, 2023Updated 3 years ago
- The official github repo for the open online courses: "Dive into LLMs".☆10Mar 15, 2024Updated 2 years ago
- python igraph tutorial☆11Nov 23, 2023Updated 2 years ago
- ☆102Feb 11, 2026Updated 4 months ago
- Guilin University of Electronic Technology presentation template (Beamer)☆13Dec 22, 2023Updated 2 years ago
- Learning a range of topics by following Tensorrt_pro☆39Nov 25, 2022Updated 3 years ago
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks☆10Nov 27, 2024Updated last year
- Using ChatGPT to select interesting arXiv papers☆17Aug 19, 2025Updated 10 months ago
- The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models.☆15Dec 16, 2024Updated last year
- ☆21May 26, 2026Updated last month
- ☆45Apr 9, 2024Updated 2 years ago
- Official implementation of paper "VLM³: Vision Language Models Are Native 3D Learners".☆360Jun 1, 2026Updated last month
- ☆18Feb 24, 2025Updated last year
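- Baidu human body analysis demo: body keypoints, body attributes, gesture recognition, portrait segmentation, crowd counting, driving behavior analysis (invite-only beta), dynamic crowd counting (invite-only beta)☆15Nov 29, 2018Updated 7 years ago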
- a fast and customizable CUDA int4 tensor core gemm☆15Aug 2, 2024Updated last year
- Paper list for Modern Hopfield Networks☆27Mar 7, 2026Updated 3 months ago
- ☆12Sep 23, 2024Updated last year
- Paxplot is a Python visualization library for parallel axis, or parallel coordinate, plots.☆10Sep 29, 2025Updated 9 months ago
- The official implementation of the paper S2-VER: Semi-Supervised Visual Emotion Recognition☆11Apr 28, 2024Updated 2 years ago
- DELTA: Decomposed Efficient Long-Term Robot Task Planning using Large Language Models☆27Apr 16, 2025Updated last year
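- One-stop crawler for data from multiple platforms, with automatic cleaning into the required format. Currently supported: Weibo. Planned: WeChat, Zhihu, Xueqiu, Xiaohongshu, and others☆12Nov 24, 2023Updated 2 years ago
- LaTeX résumé template for graduate admission by recommendation and job applications☆38Mar 23, 2025Updated last year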
- Course project for CS230. Implemented using PyTorch.☆16Dec 17, 2018Updated 7 years ago
- ☆14May 12, 2025Updated last year