Official Implementation of "RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs"
☆28 Jul 23, 2025 Updated 8 months ago
Alternatives and similar repositories for RTopK
Users that are interested in RTopK are comparing it to the libraries listed below.
- ☆12 Aug 22, 2023 Updated 2 years ago
- Official Implementation of "LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference" ☆25 Nov 12, 2023 Updated 2 years ago
- Automatic ReLU Reduction ☆15 Dec 20, 2023 Updated 2 years ago
- Official Implementation of "Accel-GNN: High-Performance GPU Accelerator Design for Graph Neural Networks" ☆52 Mar 20, 2025 Updated last year
- This repo contains the dataset for paper: Creating a Dataset Supporting Translation Between OpenMP Fortran and C++ Code ☆15 Dec 1, 2023 Updated 2 years ago
- Sparse Backpropagation for Mixture-of-Expert Training ☆30 Jul 2, 2024 Updated last year
- ☆48 Jan 3, 2026 Updated 3 months ago
- ☆11 Apr 27, 2013 Updated 12 years ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c… ☆27 Dec 10, 2022 Updated 3 years ago
- Distributed SDDMM Kernel ☆12 Jul 8, 2022 Updated 3 years ago
- Fun project to run your own LLM chat bot using llama.cpp ☆11 Jun 9, 2023 Updated 2 years ago
- A Vector Caching Scheme for Streaming FPGA SpMV Accelerators ☆10 Sep 7, 2015 Updated 10 years ago
- A toy Inspect implementation of the Bliss Attractor eval from Claude 4 System Card Welfare Assessment ☆38 Jun 5, 2025 Updated 10 months ago
- [NeurIPS 2025 MechInterp Workshop - Spotlight] Official implementation of the paper "RelP: Faithful and Efficient Circuit Discovery in La… ☆27 Nov 3, 2025 Updated 5 months ago
- Matlab mex wrappers to cuSPARSE (NVIDIA) ☆11 Dec 10, 2025 Updated 4 months ago
- ☆12 Aug 26, 2025 Updated 7 months ago
- ICML 2025 Papers: Dive into cutting-edge research from the premier machine learning conference. Stay current with breakthroughs in deep l… ☆37 Oct 24, 2025 Updated 5 months ago
- A synthetic graph generator on spark for the LDBC Financial Benchmark, featured as temporal graph ☆14 Apr 12, 2026 Updated last week
- Mamba support for transformer lens ☆19 Sep 17, 2024 Updated last year
- MLPerf Mobile benchmarks ☆15 Apr 14, 2026 Updated last week
- ☆13 Apr 3, 2024 Updated 2 years ago
- The official implementation of the paper "MLP Memory: A Retriever-Pretrained Memory for Large Language Models". (ICLR 2026) ☆58 Jan 28, 2026 Updated 2 months ago
- Proof of concept implementation of Sigmabus https://eprint.iacr.org/2023/1406 ☆10 Dec 20, 2023 Updated 2 years ago
- A tiny easily hackable implementation of a feature dashboard. ☆16 Oct 21, 2025 Updated 6 months ago
- A test library for computing modular exponentiation in parallel using AVX-512 vector arithmetic ☆12 Dec 18, 2023 Updated 2 years ago
- A repo for learning how to parallelize computations in the GPU using Apple's Metal, in Rust. ☆10 Mar 17, 2023 Updated 3 years ago
- Compare Bloxroute and Fiber transaction streams ☆10 Nov 22, 2024 Updated last year
- Tendermint implementation of the blockchain of Aleo verifiable computing model built by LambdaClass ☆15 Feb 8, 2023 Updated 3 years ago
- This is the official repository for "Explanatory Learning: Beyond Empiricism in Neural Networks". ☆15 May 17, 2022 Updated 3 years ago
- Minimum Description Length probing for neural network representations ☆20 Jan 28, 2025 Updated last year
- ☆24 Jan 28, 2025 Updated last year
- LLM4HWDesign Starting Toolkit ☆19 Oct 4, 2024 Updated last year
- The entry point for Rust projects to be run on Valida ☆10 Mar 14, 2025 Updated last year
- Mapping out the "memory" of neural nets with data attribution ☆53 Updated this week
- ☆14 Mar 1, 2021 Updated 5 years ago
- Gamora: Graph Learning based Symbolic Reasoning for Large-Scale Boolean Networks (DAC'23) ☆58 Jan 8, 2025 Updated last year
- ☆12 Jun 5, 2025 Updated 10 months ago
- 🦎 Prototypes on polymorphic, metamorphic and poly-metamorphic malwares in Rust 🦎 ☆14 Oct 8, 2023 Updated 2 years ago
- Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores. ☆92 Nov 23, 2022 Updated 3 years ago