An Open Source Kepler GPU Assembler
☆21Jan 23, 2017Updated 9 years ago
Alternatives and similar repositories for KeplerAs
Users that are interested in KeplerAs are comparing it to the libraries listed below.
- assembler for NVIDIA FERMI. Imported from Google Code☆77Mar 22, 2015Updated 11 years ago
- Third party assembler and GEMM library for NVIDIA Kepler GPU☆85Oct 8, 2019Updated 6 years ago
- Assembler for NVIDIA Volta and Turing GPUs☆242Jan 13, 2022Updated 4 years ago
- ☆27Oct 26, 2019Updated 6 years ago
- Experiments evaluating preemption on the NVIDIA Pascal architecture☆16Nov 10, 2016Updated 9 years ago
- ☆17Aug 9, 2022Updated 3 years ago
- An unofficial cuda assembler, for all generations of SASS, hopefully :)☆586Apr 20, 2023Updated 3 years ago
- ☆48Dec 11, 2020Updated 5 years ago
- ☆46Nov 1, 2025Updated 6 months ago
- Efficient CUDA Stream Compaction Library☆34Jun 9, 2023Updated 2 years ago
- CUDA Tensor Transpose (cuTT) library☆55Aug 10, 2017Updated 8 years ago
- NeuroSync: A Scalable and Accurate Brain Simulation System using Safe and Efficient Speculation (HPCA 2022)☆13Nov 9, 2022Updated 3 years ago
- Use tensor core to calculate back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instruction.☆13Nov 3, 2023Updated 2 years ago
- Composable high-level instrumentation for C libraries' malloc and friends☆20Nov 15, 2025Updated 6 months ago
- Benchmark of TVM quantized model on CUDA☆112Jun 19, 2020Updated 5 years ago
- Flexible GPGPU instrumentation☆90Oct 10, 2019Updated 6 years ago
- Assembler for NVIDIA Maxwell architecture☆1,067Jan 3, 2023Updated 3 years ago
- A GPGPU library with OpenGL. Intended to be cross platform as possible using the Transform Feedback API.☆10Jul 30, 2018Updated 7 years ago
- Multiple 1-stencil implementations using nvidia cuda.☆12Dec 2, 2017Updated 8 years ago
- ☆12Jun 22, 2023Updated 2 years ago
- Start Scaled YOLOv4☆10Jan 9, 2021Updated 5 years ago
- A Benchmark Toolkit for Assembly Instructions Using the LLVM JIT☆18Oct 26, 2020Updated 5 years ago
- chef cookbook to install Apache Spark☆10Jul 17, 2015Updated 10 years ago
- nvptx-tools: a collection of tools for use with nvptx-none GCC toolchains.☆52Apr 7, 2026Updated last month
- An open-source framework for optimizing binary image processing algorithms.☆16Feb 25, 2021Updated 5 years ago
- C library plusifier☆11Nov 13, 2021Updated 4 years ago
- Relief Mapping Demo☆13Aug 18, 2011Updated 14 years ago
- This is a demo how to write a high performance convolution run on apple silicon☆56Feb 8, 2022Updated 4 years ago
- Proof of concept prototype to perform distributed training using BVLC/caffe, based on a parameter server implementation using MPI. Data p…☆13May 7, 2015Updated 11 years ago
- Decuda and cudasm, the CUDA binary utilities package. Low-level tools for NVidia G80 GPUs.☆107Jul 24, 2010Updated 15 years ago
- Implementation of the Barnes-Hut algorithm in C++☆12Jul 8, 2011Updated 14 years ago
- An unofficial cuda assembler, for all generations of SASS, hopefully :)☆85Mar 20, 2023Updated 3 years ago
- ☆10Apr 24, 2023Updated 3 years ago
- GPU implementation of Winograd convolution☆10Oct 23, 2017Updated 8 years ago
- [IJCAI 2025] Optimized View and Geometry Distillation from Multi-view Diffuser☆18May 2, 2025Updated last year
- This project is to identify buildings in satellite using Unet and masking method☆11Apr 12, 2026Updated last month
- Several common methods of matrix multiplication are implemented on CPU and Nvidia GPU using C++11 and CUDA.☆14Feb 8, 2023Updated 3 years ago
- Yet Another Indent Finder, Almost...☆22Apr 10, 2020Updated 6 years ago
- A GCC plugin to insert pytest-like assert introspections☆19Jun 6, 2020Updated 5 years ago