Parallel Prefix Sum (Scan) with CUDA.
☆15 · Jul 17, 2020 · Updated 5 years ago
Alternatives and similar repositories for CUDA-Parallel-Prefix-Sum
Users that are interested in CUDA-Parallel-Prefix-Sum are comparing it to the libraries listed below.
- A pytorch implementation of focal loss ☆11 · Oct 13, 2023 · Updated 2 years ago
- This repository is outdated and the related functionality has been migrated to https://github.com/easysoc/easysoc-firrtl ☆11 · Nov 3, 2021 · Updated 4 years ago
- Code for the paper: https://arxiv.org/pdf/2309.06979.pdf ☆21 · Jul 29, 2024 · Updated last year
- Procyon is the brightest star in the constellation of Canis Minor. But it's also the name of my RISC-V out-of-order processor. ☆12 · Apr 6, 2023 · Updated 3 years ago
- Dynamic Hashed Blocks (DHB) data structure for dynamic graphs ☆12 · Sep 8, 2025 · Updated 9 months ago
- ☆10 · Mar 24, 2023 · Updated 3 years ago
- Extending the Neural Graph Algorithm Executor ☆13 · Dec 8, 2022 · Updated 3 years ago
- GPU for OENG1167 in Verilog HDL for DE10 series boards ☆15 · Nov 1, 2020 · Updated 5 years ago
- QuteRTL: A RTL Front-End Towards Intelligent Synthesis and Verification ☆16 · Nov 8, 2016 · Updated 9 years ago
- ☆16 · Apr 30, 2021 · Updated 5 years ago
- Findings of ACL 2021 ☆24 · May 8, 2021 · Updated 5 years ago
- Gaussian Splatting implementation based on gsplat. Easy to install and use. ☆26 · Mar 8, 2025 · Updated last year
- Cython implementation of Moattar and Homayounpour's Voice Activity Detection (VAD) algorithm fast enough for real-time on an RPi 3. ☆12 · Aug 18, 2018 · Updated 7 years ago
- CUDA implementation of exclusive prefix sum via Blelloch's algorithm ☆29 · Jul 19, 2017 · Updated 8 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 · Aug 12, 2023 · Updated 2 years ago
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns" ☆18 · Mar 15, 2024 · Updated 2 years ago
- Diffusion Monte Carlo method ☆12 · Nov 2, 2018 · Updated 7 years ago
- Verilog AST ☆21 · Dec 2, 2023 · Updated 2 years ago
- ☆11 · Oct 11, 2023 · Updated 2 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based … ☆12 · Mar 18, 2023 · Updated 3 years ago
- Cortex-M0 DesignStart Wrapper ☆24 · Aug 11, 2019 · Updated 6 years ago
- ☆10 · Jun 17, 2020 · Updated 5 years ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Oct 9, 2022 · Updated 3 years ago
- Statistical discontinuous constituent parsing ☆11 · Feb 15, 2018 · Updated 8 years ago
- Code for "Deep Energy-Based Modeling of Discrete-Time Physics," NeurIPS, 2020. (Oral) ☆19 · Jan 30, 2022 · Updated 4 years ago
- Monocular Depth Estimation using Atrous Convolutions ☆11 · Apr 5, 2019 · Updated 7 years ago
- JAX/Flax implementation of the Hyena Hierarchy ☆35 · Apr 27, 2023 · Updated 3 years ago
- ☆14 · Jul 23, 2025 · Updated 10 months ago
- Unofficial implementation of paper : Exploring the Space of Key-Value-Query Models with Intention ☆12 · May 24, 2023 · Updated 3 years ago
- A neural network for filtering target speaker's voice from audio written in tensorflow