Official Implementation of "RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs"
☆29 · Jul 23, 2025 · Updated 8 months ago
Alternatives and similar repositories for RTopK
Users that are interested in RTopK are comparing it to the libraries listed below.
- ☆12 · Aug 22, 2023 · Updated 2 years ago
- Automatic ReLU Reduction ☆15 · Dec 20, 2023 · Updated 2 years ago
- Sparse Backpropagation for Mixture-of-Expert Training ☆29 · Jul 2, 2024 · Updated last year
- ☆48 · Jan 3, 2026 · Updated 2 months ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c… ☆27 · Dec 10, 2022 · Updated 3 years ago
- Distributed SDDMM Kernel ☆12 · Jul 8, 2022 · Updated 3 years ago
- Fun project to run your own LLM chat bot using llama.cpp ☆11 · Jun 9, 2023 · Updated 2 years ago
- ☆158 · Dec 30, 2025 · Updated 2 months ago
- ☆13 · May 8, 2020 · Updated 5 years ago
- ☆12 · Aug 26, 2025 · Updated 7 months ago
- Debate interface, experiments, etc. ☆10 · Mar 12, 2024 · Updated 2 years ago
- The official implementation of the paper "MLP Memory: A Retriever-Pretrained Memory for Large Language Models". (ICLR 2026) ☆55 · Jan 28, 2026 · Updated 2 months ago
- Official codebase for NeurIPS 2022 paper End-to-end Learning to Index and Search in Large Output Spaces ☆12 · Apr 19, 2023 · Updated 2 years ago
- A synthetic graph generator on Spark for the LDBC Financial Benchmark, featured as temporal graph ☆14 · Jan 9, 2026 · Updated 2 months ago
- Mamba support for transformer lens ☆19 · Sep 17, 2024 · Updated last year
- MLPerf Mobile benchmarks ☆16 · Jan 27, 2026 · Updated 2 months ago
- Proof of concept implementation of Sigmabus https://eprint.iacr.org/2023/1406 ☆10 · Dec 20, 2023 · Updated 2 years ago
- A tiny easily hackable implementation of a feature dashboard. ☆16 · Oct 21, 2025 · Updated 5 months ago
- Connect a Ublox NEO-6M/NEO-M8N GPS module to a WiPy2.0/3.0 ☆10 · Apr 29, 2018 · Updated 7 years ago
- Some utility functions to help myself (and perhaps others) go faster with ML/AI work ☆45 · Feb 11, 2026 · Updated last month
- A repo for learning how to parallelize computations in the GPU using Apple's Metal, in Rust. ☆10 · Mar 17, 2023 · Updated 3 years ago
- Compare Bloxroute and Fiber transaction streams ☆10 · Nov 22, 2024 · Updated last year
- A test library for computing modular exponentiation in parallel using AVX-512 vector arithmetic ☆12 · Dec 18, 2023 · Updated 2 years ago
- Tendermint implementation of the blockchain of Aleo verifiable computing model built by LambdaClass ☆15 · Feb 8, 2023 · Updated 3 years ago
- This is the official repository for "Explanatory Learning: Beyond Empiricism in Neural Networks". ☆15 · May 17, 2022 · Updated 3 years ago
- Minimum Description Length probing for neural network representations ☆20 · Jan 28, 2025 · Updated last year
- Mapping out the "memory" of neural nets with data attribution ☆50 · Updated this week
- The entry point for Rust projects to be run on Valida ☆10 · Mar 14, 2025 · Updated last year
- LLM4HWDesign Starting Toolkit ☆19 · Oct 4, 2024 · Updated last year
- Scalable radix top-k selection on GPUs. ☆21 · Jan 27, 2025 · Updated last year
- ☆14 · Mar 1, 2021 · Updated 5 years ago
- Gamora: Graph Learning based Symbolic Reasoning for Large-Scale Boolean Networks (DAC'23) ☆58 · Jan 8, 2025 · Updated last year
- [MLSys 2022] "BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node … ☆56 · Oct 6, 2023 · Updated 2 years ago
- Generalized Optimal Transport Attention with Trainable Priors ☆26 · Jan 25, 2026 · Updated 2 months ago
- ☆12 · Jun 5, 2025 · Updated 9 months ago
- 🦎 Prototypes of polymorphic, metamorphic and poly-metamorphic malware in Rust 🦎 ☆14 · Oct 8, 2023 · Updated 2 years ago
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores. ☆92 · Nov 23, 2022 · Updated 3 years ago
- New batched algorithm for sparse matrix-matrix multiplication (SpMM) ☆16 · May 7, 2019 · Updated 6 years ago
- Composable numerical solvers for unconstrained and simple-bounds constrained convex optimization problems in Rust. WASM compatible ☆14 · Jul 10, 2025 · Updated 8 months ago