From MHA, MQA, GQA to MLA by 苏剑林, with code
☆47Feb 19, 2025Updated last year
Alternatives and similar repositories for MLA_tutorial
Users that are interested in MLA_tutorial are comparing it to the libraries listed below.
- ☆23Aug 20, 2025Updated 8 months ago
- ☆14Apr 19, 2024Updated 2 years ago
- A curated list of open-source projects that help leverage CXL technology.☆28Sep 26, 2024Updated last year
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆30Mar 25, 2026Updated last month
- ☆13Jul 2, 2021Updated 4 years ago
- Official repository Flash Local Linear Attention☆23Apr 23, 2026Updated last week
- Implement Flash Attention using Cute.☆106Dec 17, 2024Updated last year
- NetLogo models developed in the book "Agent-Based Evolutionary Game Dynamics"☆10Feb 19, 2026Updated 2 months ago
- Implement custom operators in PyTorch with cuda/c++☆77Jan 1, 2023Updated 3 years ago
- Chinese Pharmacopoeia RAG project☆10Oct 26, 2024Updated last year
- aigc evals☆10Dec 2, 2023Updated 2 years ago
- Code for the paper "Cottention: Linear Transformers With Cosine Attention"☆20Nov 15, 2025Updated 5 months ago
- Materials associated with the Agent-based Modelling training series☆11Mar 18, 2022Updated 4 years ago
- A lightweight, production-ready C++ library for LLM tokenization, fully compatible with HuggingFace tokenizer.json.☆28Jan 4, 2026Updated 3 months ago
- CenterPoint model trained with MMDetection3d on custom dataset, and deployed with TensorRT☆35Mar 15, 2023Updated 3 years ago
- Contextual Position Encoding but with some custom CUDA Kernels https://arxiv.org/abs/2405.18719☆22Jun 5, 2024Updated last year
- python igraph tutorial☆11Nov 23, 2023Updated 2 years ago
- ☆101Feb 11, 2026Updated 2 months ago
- Your finetuned model's back to its original safety standards faster than you can say "SafetyLock"!☆11Oct 16, 2024Updated last year
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks☆10Nov 27, 2024Updated last year
- Using ChatGPT to select interesting arXiv papers☆17Aug 19, 2025Updated 8 months ago
- This repository contains the geatpy implementation for paper: Co-operative Prediction Strategy for Solving Dynamic Multi-Objective Optimi…☆10Sep 30, 2020Updated 5 years ago
- The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models.☆14Dec 16, 2024Updated last year
- Official repository for the paper "Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules" (…☆23Jun 11, 2025Updated 10 months ago
- ☆13May 12, 2025Updated 11 months ago
- ☆44Apr 9, 2024Updated 2 years ago
- ☆20Apr 14, 2026Updated 2 weeks ago
- Empower your real estate decisions with our data-driven model, delivering precise rental predictions for landlords and comprehensive insi…☆13Apr 26, 2025Updated last year
- a fast and customizable CUDA int4 tensor core gemm☆15Aug 2, 2024Updated last year
- ☆23May 4, 2020Updated 5 years ago
- Paper list for Modern Hopfield Networks☆27Mar 7, 2026Updated last month
- A method to automatically calibrate lidar and camera☆21Jun 11, 2024Updated last year
- Paxplot is a Python visualization library for parallel axis, or parallel coordinate, plots.☆10Sep 29, 2025Updated 7 months ago
- The official implementation of the paper S2-VER: Semi-Supervised Visual Emotion Recognition☆11Apr 28, 2024Updated 2 years ago
- **ASCM4ABSA** - Our code and proposed data for NLPCC 2022 paper titled "Aspect-specific Context Modeling for Aspect-based Sentiment Analy…☆12Mar 26, 2023Updated 3 years ago
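- One-stop crawler for data from multiple platforms, with automatic cleaning and formatting into the required format. Currently supported platform: Weibo. Planned additions: WeChat, Zhihu, Xueqiu, Xiaohongshu, and others.☆12Nov 24, 2023Updated 2 years ago
- LaTeX résumé template for graduate-school recommendation and job applications☆27Mar 23, 2025Updated last year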
- [CVPR 2026] Layer-wise Scale Alignment for Training-Free Streaming 4D Reconstruction☆60Mar 18, 2026Updated last month
- Official repository for the paper "Gradient-based Jailbreak Images for Multimodal Fusion Models" (https://arxiv.org/abs/2410.03489)☆19Oct 22, 2024Updated last year