Will write CUDA for 100 days
☆39May 25, 2025Updated 11 months ago
Alternatives and similar repositories for 100-days-of-cuda
Users that are interested in 100-days-of-cuda are comparing it to the libraries listed below.
- A std::execution-style runtime context and high-performance RPC transport built on OpenUCX, including CUDA/ROCM/... devices with RDMA.☆30Updated this week
- A small personal project for learning compiler principles and understanding the main workflow of building a compiler☆10Oct 7, 2020Updated 5 years ago
- A C++ client library for Redis Cluster.☆14Mar 9, 2016Updated 10 years ago
- Linux from beginner to master☆32Dec 4, 2025Updated 5 months ago
- ☆45May 4, 2025Updated last year
- Programming with Go and large language models☆44Mar 5, 2025Updated last year
- Explore training for quantized models☆26Jul 12, 2025Updated 10 months ago
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.☆19Feb 9, 2026Updated 3 months ago
- Implementation of an RL-based agent that uses Q-learning to develop a policy for effectively solving a 3x3x3 Rubik's Cube☆19Mar 12, 2019Updated 7 years ago
- GEMM☆10Aug 26, 2023Updated 2 years ago
- Compile & run a single CUDA file on the cloud GPUs☆14Sep 8, 2024Updated last year
- ☆44Mar 11, 2026Updated 2 months ago
- ☆11Updated this week
- Examples for learning assembly language☆10Aug 5, 2021Updated 4 years ago
- 🎓Automatically update circult-eda-mlsys-tinyml papers daily using GitHub Actions (updates every 8 hours)☆10May 14, 2026Updated last week
- ☆48Mar 27, 2023Updated 3 years ago
- GEMV implementation with CUTLASS☆21Aug 21, 2025Updated 9 months ago
- ☆14Nov 3, 2025Updated 6 months ago
- Multi-heap-sort for many small arrays, quicksort with 3 pivots for one big array, CUDA acceleration, CUDA memory compression.☆13Sep 29, 2024Updated last year
- This project aims to provide a highly effective KV cache management framework for LLM inference and improve memory utilization and inference sp…☆53Apr 24, 2026Updated 3 weeks ago
- DoubleAI’s hyperoptimised version of cuGraph☆52Mar 3, 2026Updated 2 months ago
- ☆13Jan 15, 2022Updated 4 years ago
- ☆18Nov 22, 2025Updated 5 months ago
- China University MOOC, Zhejiang University, Weng Kai's online course on C programming; a record of teaching myself programming from scratch.☆16May 18, 2020Updated 6 years ago
- ☆32Jul 2, 2025Updated 10 months ago
- Fast GPU based tensor core reductions☆13Jan 13, 2023Updated 3 years ago
- portFFT is a library implementing Fast Fourier Transforms using SYCL☆19Mar 1, 2025Updated last year
- All Resources from Stanford CS106B 2021☆26Jul 11, 2025Updated 10 months ago
- Phoom 3D VR AR Conferencing App☆16May 12, 2020Updated 6 years ago
- Toy vector database written in C99.☆25Sep 5, 2024Updated last year
- Open deep learning compiler stack for cpu, gpu and specialized accelerators☆19May 13, 2026Updated last week
- ☆12Aug 31, 2023Updated 2 years ago
- Frechet Video Distance metric implemented on PyTorch☆34Mar 22, 2020Updated 6 years ago
- [ICML 2025] SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity☆75Mar 10, 2026Updated 2 months ago
- Experimental GPU language with meta-programming☆31Sep 6, 2024Updated last year
- ☆15Mar 23, 2022Updated 4 years ago
- ☆68May 23, 2025Updated 11 months ago
- ☆12Sep 2, 2025Updated 8 months ago
- Multi-elevator system☆13Oct 24, 2017Updated 8 years ago