A 128 bit unsigned integer class for CUDA
☆46Jan 3, 2025Updated last year
Alternatives and similar repositories for CUDA-uint128
Users that are interested in CUDA-uint128 are comparing it to the libraries listed below.
- CGBN: CUDA Accelerated Multiple Precision Arithmetic (Big Num) using Cooperative Groups☆241Mar 31, 2026Updated 2 weeks ago
- A GPU accelerated implementation of the sieve of Eratosthenes☆66Dec 18, 2022Updated 3 years ago
- Launching collective tasks in bulk☆37Oct 4, 2019Updated 6 years ago
- A library to benchmark CUDA code, similar to google benchmark.☆31Apr 18, 2021Updated 5 years ago
- CUDA accelerated(X) Multi-Precision library☆96Sep 9, 2016Updated 9 years ago
- GPU Optimization and Memory Abstraction Framework☆33Oct 31, 2019Updated 6 years ago
- Code for paper "Engineering a High-Performance GPU B-Tree" accepted to PPoPP 2019☆58Jun 27, 2022Updated 3 years ago
- very fast (NOT SECURE) implementation of arithmetic on curve secp256k1 on x86_64☆24Jun 29, 2020Updated 5 years ago
- Number Theoretic Transform Implementation on GPU for FHE Applications☆44Feb 6, 2021Updated 5 years ago
- Full-speed Array of Structures access☆177Apr 25, 2023Updated 2 years ago
- A fast and unsafe version of an optimized C library for EC operations on curve secp256k1☆26Sep 18, 2023Updated 2 years ago
- Multiple-precision GPU accelerated linear algebra routines (dense and sparse) based on residue number system☆22Dec 19, 2022Updated 3 years ago
- 🚀 Sum of the primes below x☆41Jun 17, 2022Updated 3 years ago
- Optimized implementations of the Number Theoretic Transform (NTT) algorithm for the ring R/(X^N + 1) where N=2^m.☆27Nov 23, 2021Updated 4 years ago
- A fast and highly scalable GPU dynamic memory allocator☆112Mar 11, 2015Updated 11 years ago
- Implementation of Multi-Key TFHE [KMS22]☆20Jan 24, 2024Updated 2 years ago
- A package for constructing sparse tensors from CSV-like data sources.☆11Dec 24, 2017Updated 8 years ago
- Welcome to the GPU-NTT-Optimization repository! We present cutting-edge algorithms and implementations for optimizing the Number Theoreti…☆51Feb 13, 2026Updated 2 months ago
- Python tools for NVIDIA Profiler☆21Dec 21, 2017Updated 8 years ago
- CUDA and OpenMP implementations of C2R/R2C inplace transposition☆48Feb 10, 2015Updated 11 years ago
- mallocMC: Memory Allocator for Many Core Architectures☆58Mar 20, 2026Updated 3 weeks ago
- A GPU orchestrated BIP39 mnemonic solver for BTC blockchain☆12Jul 26, 2022Updated 3 years ago
- Accelerating MSM Operations on GPU/FPGA☆15Sep 16, 2022Updated 3 years ago
- Related address generator.☆30May 1, 2015Updated 10 years ago
- OPHELib is an optimized library for partially homomorphic encryption. It currently provides an implementation of the Paillier encryption …☆15May 29, 2019Updated 6 years ago
- ☆16Feb 27, 2022Updated 4 years ago
- The Miller-Rabin probabilistic primality test in C++ w/GMP☆32Oct 9, 2017Updated 8 years ago
- ☆12Dec 8, 2021Updated 4 years ago
- Network based loader and flasher for Pano G2 devices☆15Jul 8, 2023Updated 2 years ago
- RISC-V System on Chip Builder☆12Sep 27, 2020Updated 5 years ago
- Elliptic curve tools, ECDSA, and ECDSA attacks.☆40Aug 14, 2024Updated last year
- The CUDA Multiple Precision Arithmetic Library☆50Oct 14, 2012Updated 13 years ago
- Collection of CUDA benchmarks, with a focus on unified vs. explicit memory management.☆21Oct 15, 2019Updated 6 years ago
- ☆31Aug 28, 2020Updated 5 years ago
- A vectorizable multi-dimensional iterator for C++ using the Coroutines TS☆12Jun 5, 2022Updated 3 years ago
- A proposal for a standard parallel algorithms library for ISO C++.☆22Feb 28, 2014Updated 12 years ago
- BTC check generator. The script generates links for opening BTC checks in the Telegram bot @btc_change_bot.☆10May 19, 2023Updated 2 years ago
- materials for my workshop "Latest Deep Learning Models for NLP" @ the European Open Data Science Conference 2019☆11Feb 3, 2020Updated 6 years ago
- In this repo we will construct a POC implementation of the MLE sumcheck end-end in a GPU☆41Feb 17, 2025Updated last year