☆23Feb 16, 2022Updated 4 years ago
Alternatives and similar repositories for CUDACommunityMeetup2021
Users that are interested in CUDACommunityMeetup2021 are comparing it to the libraries listed below.
- ☆14Apr 24, 2024Updated 2 years ago
- RAPIDS Deployment Documentation☆15Jun 3, 2026Updated last week
- An intelligent matrix format designer for SpMV☆10Oct 10, 2023Updated 2 years ago
- Generate simple index ranges in C++ and CUDA C++☆39Jun 14, 2023Updated 2 years ago
- My study notes and hands-on projects for CUDA-based GPU programming☆12Dec 11, 2025Updated 5 months ago
- ☆12Dec 21, 2023Updated 2 years ago
- Comparing Deep Learning Inference of PyTorch models running on CPU, CUDA and TensorRT☆17Feb 20, 2022Updated 4 years ago
- Utilities for accessing AMD's Machine-Readable GPU ISA Specifications.☆52Apr 9, 2026Updated 2 months ago
- ☆12Jan 19, 2020Updated 6 years ago
- ❤️ CUDA/C++ GPU graph analytics simplified.☆32Sep 19, 2022Updated 3 years ago
- Generic exascale-ready library for halo-exchange operations on a variety of grids/meshes☆11May 28, 2026Updated last week
- Advanced Parallel Programming☆21Mar 16, 2021Updated 5 years ago
- End to End steps for adding custom ops in PyTorch.☆24Aug 20, 2020Updated 5 years ago
- PaStiX (Parallel Sparse matriX package) solver library☆20Nov 20, 2018Updated 7 years ago
- ☆14Apr 10, 2023Updated 3 years ago
- CUDA executors☆14Dec 4, 2020Updated 5 years ago
- An extension library of WMMA API (Tensor Core API)☆113Jul 12, 2024Updated last year
- A CUDA implementation of the Tsetlin Machine based on bitwise operators☆26Aug 19, 2019Updated 6 years ago
- Parallel_Computer_Architecture classic books☆17May 13, 2022Updated 4 years ago
- CUDA matrix computation library specialized for small matrix operations (3x3, 4x4, 1x3, 1x4, etc.). Including buffer☆19Mar 8, 2024Updated 2 years ago
- fork of NVIDIA's cudaraster for research use, taken from http://code.google.com/p/cudaraster/☆31Dec 31, 2012Updated 13 years ago
- A parser for PTX 6.5☆13Jun 19, 2023Updated 2 years ago
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming"☆98Aug 14, 2023Updated 2 years ago
- ☆115Apr 19, 2024Updated 2 years ago
- ☆11Jun 9, 2023Updated 3 years ago
- A notebook testing CPU speed vs GPU speed with Pytorch and CUDA☆18Dec 25, 2021Updated 4 years ago
- CUDA kernel author's tools☆116Apr 24, 2022Updated 4 years ago
- Local inference for Llama and Tongyi Qianwen (Qwen) large models, implemented with Apple Metal☆10Apr 26, 2024Updated 2 years ago
- Global Address SPace toolbox -- Julia wrapper☆10Nov 17, 2017Updated 8 years ago
- 🎃 GPU load-balancing library for regular and irregular computations.☆66May 11, 2026Updated last month
- Runs a single CUDA/OpenCL kernel, taking its source from a file and arguments from the command-line☆26May 17, 2026Updated 3 weeks ago
- Automatically identifies keywords in text and makes them bold.☆10Oct 30, 2024Updated last year
- An LLM-powered chatbot for fediverse. A tech demo for BotKit.☆15Dec 20, 2025Updated 5 months ago
- A Lightweight Graph Processing Framework for Multi-GPUs☆14Apr 15, 2015Updated 11 years ago
- Range-based for loops to iterate over a range of numbers or values☆34Nov 23, 2016Updated 9 years ago
- Website for CS 265☆34Dec 27, 2024Updated last year
- Implementations of 2D Image Convolution algorithm with CUDA (using global memory, shared memory and constant memory)☆17Jan 21, 2018Updated 8 years ago
- An FPGA design for simulating biological neurons☆18Jul 5, 2024Updated last year
- GPU model checker☆13Apr 17, 2019Updated 7 years ago