Effective transpose on Hopper GPU
☆28Sep 6, 2025Updated 5 months ago
Alternatives and similar repositories for effective_transpose
Users who are interested in effective_transpose are comparing it to the libraries listed below
- ☆12Nov 5, 2024Updated last year
- Implement Flash Attention using Cute.☆101Dec 17, 2024Updated last year
- Official format for time series data captured from 3D Engines.☆12May 14, 2023Updated 2 years ago
- A tool for coordinated checkpoint/restore of distributed applications with CRIU☆31Feb 15, 2026Updated 2 weeks ago
- Flash Attention in 300-500 lines of CUDA/C++☆36Aug 22, 2025Updated 6 months ago
- DeeperGEMM: crazy optimized version☆74May 5, 2025Updated 9 months ago
- Core OpenEP code - Matlab implementation☆11Jan 25, 2026Updated last month
- Accelerate LLM preference tuning via prefix sharing with a single line of code☆51Jul 4, 2025Updated 7 months ago
- Official repo for paper "HiMoE-VLA: Hierarchical Mixture-of-Experts for Generalist Vision-Language-Action Policies"☆23Dec 12, 2025Updated 2 months ago
- Tools for reading OpenStreetMap (OSM) data and gradually turning it into routable networks.☆13Jul 2, 2015Updated 10 years ago
- Fast, efficient, private cloud store☆10Apr 6, 2017Updated 8 years ago
- Decentralized KV storage engine supporting decentralized P2P networking, data synchronization, and consistency between nodes.☆14Jan 4, 2026Updated last month
- A deep learning intermediate representation for multi-platform compilation optimization☆10Oct 28, 2024Updated last year
- Implementation of the Idemix attribute based credential scheme used in IRMA☆11Dec 11, 2024Updated last year
- Triton-based Symmetric Memory operators and examples☆85Jan 15, 2026Updated last month
- Access ChatGPT/Gemini/Claude from Emacs without APIs☆10Dec 25, 2025Updated 2 months ago
- Fastest kernels written from scratch☆548Sep 18, 2025Updated 5 months ago
- RDMA optimization tutorial for beginners, based on verbs and rdmacm, for high-performance computing and disaggregated memory systems☆15Sep 30, 2024Updated last year
- Evergreen front-end☆12Apr 5, 2024Updated last year
- Benchmark scripts for comparing tutorials in PyTorch and JAX☆14Aug 25, 2022Updated 3 years ago
- Abstraction over Lua coroutines to nicely build and synchronize asynchronous operations. Renamed and moved to https://imagicthecat.thul.f…☆12May 18, 2023Updated 2 years ago
- Some useful Lua libraries for Dota 2 addons☆10Nov 3, 2015Updated 10 years ago
- CS169.1x Software as a Service course offered by UC Berkeley at edx.org☆14Oct 28, 2014Updated 11 years ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" in PyTorch and Zeta☆13Nov 11, 2024Updated last year
- QQ group verification bot☆10Nov 9, 2021Updated 4 years ago
- Connected Papers knockoff, managing academic papers and citations with graph database.☆12Dec 26, 2023Updated 2 years ago
- Functionality for modifying Julia package registry files☆12Updated this week
- UDP-only netcat implementation with OCaml / MirageOS☆14Mar 21, 2017Updated 8 years ago
- Minimal Transformer base in JAX. A single backbone for language modelling, diffusion, classification, etc...☆13May 28, 2025Updated 9 months ago
- LaunchPad is a lightweight Slurm job launcher designed for hyper-parameter search.☆11Aug 2, 2024Updated last year
- PyTorch Lightning based framework to run experiments for self-supervised learning tasks.☆10Feb 14, 2020Updated 6 years ago
- ☆10May 1, 2023Updated 2 years ago
- ☆11Dec 22, 2024Updated last year
- Convert Lua data to string.☆11Jun 8, 2025Updated 8 months ago
- Simple language collation for Go☆13Nov 3, 2020Updated 5 years ago
- Object oriented RDF in Ruby☆58Jun 13, 2012Updated 13 years ago
- Bloom filters in Julia☆18Jul 11, 2019Updated 6 years ago
- C++ "borrowing" smart pointer.☆11May 13, 2022Updated 3 years ago
- vmnet-based gVisor TCP/IP stack☆12Jan 22, 2024Updated 2 years ago