☆20Aug 20, 2025Updated 6 months ago
Alternatives and similar repositories for algebraic-layouts
Users who are interested in algebraic-layouts are comparing it to the repositories listed below
- from MHA, MQA, GQA to MLA by 苏剑林, with code☆43Feb 19, 2025Updated last year
- A practical way of learning Swizzle☆37Feb 3, 2025Updated last year
- ☆31Aug 25, 2023Updated 2 years ago
- ☆52Jan 5, 2026Updated 2 months ago
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆29Jan 13, 2026Updated last month
- From Minimal GEMM to Everything☆183Feb 10, 2026Updated 3 weeks ago
- A collection of specialized agent skills for AI infrastructure development, enabling Claude Code to write, optimize, and debug high-perfo…☆88Feb 2, 2026Updated last month
- ☆39Apr 9, 2024Updated last year
- ☆36Aug 25, 2023Updated 2 years ago
- CenterPoint model trained with MMDetection3d on custom dataset, and deployed with TensorRT☆35Mar 15, 2023Updated 2 years ago
- 🌈 Solutions of LeetGPU☆75Mar 1, 2026Updated last week
- ☆13Sep 5, 2024Updated last year
- A simple script to plot the Roofline model for given HW platforms and applications☆10Aug 22, 2024Updated last year
- ☆33Dec 10, 2025Updated 2 months ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi…☆23Oct 1, 2025Updated 5 months ago
- ☆116May 16, 2025Updated 9 months ago
- ☆12Sep 12, 2024Updated last year
- Compression primitives for uplink compression in Federated Learning that are compatible with Secure Aggregation.☆10Jul 27, 2022Updated 3 years ago
- a fast and customizable CUDA int4 tensor core gemm☆15Aug 2, 2024Updated last year
- ☆13Dec 9, 2024Updated last year
- ☆14Updated this week
- Hu search engine: searches 户晨风's commentary, as groundwork for data cleaning and training a 户子 AI☆18Oct 25, 2025Updated 4 months ago
- For importing flomo into SiYuan (思源)☆15Dec 14, 2023Updated 2 years ago
- Multi-Level Triton Runner supporting Python, IR, PTX, and cubin.☆84Feb 26, 2026Updated last week
- ☆10Mar 3, 2021Updated 5 years ago
- Work on the BUAA OS Lab comfortably on your local machine, and submit directly with git locally.☆10Jun 2, 2021Updated 4 years ago
- Cute layout visualization☆31Jan 18, 2026Updated last month
- ICRA 2026: ORAD-3D, a large-scale off-road autonomous driving dataset. Tasks: 2D free-space detection, 3D occupancy prediction, rough GPS…☆34Feb 13, 2026Updated 3 weeks ago
- Testing a similar effect to Superliminal☆12Nov 27, 2019Updated 6 years ago
- ☆14Nov 17, 2021Updated 4 years ago
- Examples of CUDA implementations by Cutlass CuTe☆272Jul 1, 2025Updated 8 months ago
- This is an implementation of Multi-Object Tracking based on YOLOv3.☆14Dec 30, 2019Updated 6 years ago
- A method to automatically calibrate lidar and camera☆20Jun 11, 2024Updated last year
- H.264 software and hardware decoding, based on FFmpeg and MPP☆11Mar 23, 2022Updated 3 years ago
- A direct convolution library targeting ARM multi-core CPUs.☆12Nov 27, 2024Updated last year
- Basic delay-and-sum beamforming routines for demonstration purposes☆15Sep 20, 2024Updated last year
- Resolve direct-download links for 城通网盘 using Tencent Cloud Serverless☆13Oct 5, 2022Updated 3 years ago
- Code for the paper "Deep Attention Based Semi-Supervised 2D-Pose Estimation for Surgical Instruments"☆12Dec 14, 2019Updated 6 years ago
- Accelerating MoE with IO and Tile-aware Optimizations☆597Feb 27, 2026Updated last week