NVlabs / CGBN
CGBN: CUDA Accelerated Multiple Precision Arithmetic (Big Num) using Cooperative Groups
☆231 Updated 9 months ago
Alternatives and similar repositories for CGBN
Users who are interested in CGBN are comparing it to the libraries listed below.
- A 128 bit unsigned integer class for CUDA ☆46 Updated 11 months ago
- CUDA accelerated(X) Multi-Precision library ☆92 Updated 9 years ago
- Extended-precision modular arithmetic library that targets CUDA. ☆40 Updated 2 years ago
- The CUDA Multiple Precision Arithmetic Library ☆49 Updated 13 years ago
- Intel Homomorphic Encryption Acceleration Library accelerates modular arithmetic operations used in homomorphic encryption by leveraging … ☆251 Updated 5 months ago
- Number Theoretic Transform Implementation on GPU for FHE Applications ☆44 Updated 4 years ago
- Extended-precision modular arithmetic library that targets CUDA. ☆44 Updated 5 years ago
- Optimizing compiler for Fully Homomorphic Encryption (FHE) ☆79 Updated last year
- CUDA Homomorphic Encryption Library ☆207 Updated 8 years ago
- CUDA-accelerated Fully Homomorphic Encryption Library ☆237 Updated 4 years ago
- Welcome to the GPU-NTT-Optimization repository! We present cutting-edge algorithms and implementations for optimizing the Number Theoreti… ☆45 Updated 4 months ago
- Short examples illustrating AVX2 intrinsics for simple tasks. ☆98 Updated last year
- A compiler for homomorphic encryption ☆624 Updated this week
- A Library for fast Hash Tables on GPUs ☆129 Updated 2 months ago
- ☆15 Updated 3 years ago
- Samples demonstrating how to use the Compute Sanitizer Tools and Public API ☆90 Updated 2 years ago
- My notes on various HPC papers. ☆24 Updated 2 years ago
- ☆35 Updated 2 years ago
- Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain. ☆145 Updated last week
- ☆269 Updated last week
- NUMA-aware multi-CPU multi-GPU data transfer benchmarks ☆26 Updated 2 years ago
- Optimized implementations of the Number Theoretic Transform (NTT) algorithm for the ring R/(X^N + 1) where N=2^m. ☆25 Updated 4 years ago
- 🎃 GPU load-balancing library for regular and irregular computations. ☆63 Updated 3 months ago
- ☆601 Updated last week
- ☆315 Updated last month
- ☆289 Updated 2 months ago
- The missing pieces (as far as boilerplate reduction goes) of the upstream MLIR python bindings. ☆115 Updated last month
- ANT-ACE: Advanced Compiler Ecosystem for Fully Homomorphic Encryption and Domain Specific Computing ☆51 Updated last month
- Kernel Tuner ☆375 Updated last week
- Assembler for NVIDIA Volta and Turing GPUs ☆234 Updated 3 years ago