BrunoMeyer / gpu-rsfk
A GPU (CUDA) implementation, with a Python interface, of approximate KNN graph computation using the Random Sample Forest KNN algorithm.
☆12 Updated 4 months ago
Alternatives and similar repositories for gpu-rsfk:
Users who are interested in gpu-rsfk are comparing it to the repositories listed below.
- Near-storage compute aware file system and FPGA operator pipelines. ☆29 Updated 3 years ago
- Artifact for IPDPS'21: DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions. ☆13 Updated 4 years ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions ☆21 Updated 3 years ago
- This is a repo which contains some details about how to use OpenCL backend (Xilinx/Intel). ☆24 Updated 5 years ago
- SmartNIC ☆14 Updated 6 years ago
- ☆13 Updated 4 years ago
- SMASH is a hardware-software cooperative mechanism that enables highly-efficient indexing and storage of sparse matrices. The key idea of… ☆16 Updated 4 years ago
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches ☆15 Updated 5 years ago
- This is the course project for CSCE585: ML Systems. Students will build their machine learning systems based on the provided infrastructu… ☆13 Updated 4 years ago
- Arrow Matrix Decomposition - Communication-Efficient Distributed Sparse Matrix Multiplication ☆16 Updated last year
- A Distributed Multi-GPU System for Fast Graph Processing ☆65 Updated 6 years ago
- ☆21 Updated 2 years ago
- Accelerator simulation framework using nn_dataflow traces and energy, etc. post-processing ☆7 Updated 6 years ago
- A Dataflow library for graph analytics acceleration ☆14 Updated 9 years ago
- An Architecture-level Fault Injection Tool for GPU Application Resilience Evaluations ☆16 Updated 5 years ago
- An implementation of a BinaryConnect network for cifar10 ☆11 Updated 5 years ago
- Public Release of Stream-Dataflow ☆14 Updated 5 years ago
- Multi-armed bandit algorithm with tensorflow and 11 policies ☆14 Updated 2 years ago
- An Attention Superoptimizer ☆21 Updated 3 months ago
- An external memory allocator example for PyTorch. ☆14 Updated 3 years ago
- TAPA is a dataflow HLS framework that features fast compilation, expressive programming model and generates high-frequency FPGA accelerat… ☆19 Updated 8 months ago
- Part of paper: Massively Parallel Combinational Binary Neural Networks for Edge Processing ☆12 Updated 5 years ago
- Convert C files into Verilog ☆16 Updated 6 years ago
- ColTraIn HBFP Training Emulator ☆16 Updated 2 years ago
- An FPGA integration and acceleration of the popular FAISS framework for approximate similarity search ☆23 Updated 5 years ago
- SimFS: A Virtualizing Simulation Data File System Interface ☆8 Updated 5 years ago
- A package for constructing sparse tensors from CSV-like data sources. ☆11 Updated 7 years ago
- A Vector Caching Scheme for Streaming FPGA SpMV Accelerators ☆10 Updated 9 years ago
- ☆21 Updated 2 months ago
- A fast implementation of spectral clustering on GPU-CPU Platform ☆31 Updated 6 years ago