CisMine / Setup-as-Cuda-programmers
Set up CUDA
☆21 Updated 10 months ago
Alternatives and similar repositories for Setup-as-Cuda-programmers:
Users who are interested in Setup-as-Cuda-programmers are comparing it to the repositories listed below
- NVIDIA tools guide ☆119 Updated 2 months ago
- CUDA Learning guide ☆349 Updated 9 months ago
- Read custom dataset ☆11 Updated 2 years ago
- Implement Neural Networks in CUDA from Scratch ☆22 Updated 10 months ago
- Learning about CUDA by writing PTX code. ☆125 Updated last year
- Apply GPU in ML and DL ☆48 Updated last month
- Examples from Programming in Parallel with CUDA ☆131 Updated 2 years ago
- A blog where I write about research papers and blog posts I read. ☆12 Updated 4 months ago
- Code samples for the CUDA tutorial "CUDA and Applications to Task-based Programming" ☆89 Updated last year
- UNet diffusion model in pure CUDA ☆600 Updated 9 months ago
- ☆152 Updated last year
- Some CUDA example code with READMEs. ☆93 Updated last month
- ☆47 Updated this week
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆35 Updated this week
- PTX-Tutorial Written Purely By AIs (Deep Research of OpenAI and Claude 3.7) ☆62 Updated last week
- High-Performance SGEMM on CUDA devices ☆88 Updated 2 months ago
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆154 Updated last week
- CUDA Matrix Multiplication Optimization ☆177 Updated 8 months ago
- From zero to hero CUDA for accelerating maths and machine learning on GPU. ☆181 Updated last week
- ☆13 Updated 3 weeks ago
- A Survey Analyzing Generalization in Deep Reinforcement Learning ☆32 Updated 5 months ago
- ☆87 Updated last year
- ☆142 Updated 3 months ago
- CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. … ☆404 Updated last year
- Fast CUDA matrix multiplication from scratch ☆673 Updated last year
- ML/DL Math and Method notes ☆59 Updated last year
- This repository contains a simple Llama 3 implementation in pure JAX. ☆58 Updated last month
- ☆58 Updated 4 months ago
- A set of hands-on tutorials for CUDA programming ☆218 Updated 11 months ago
- Custom kernels in Triton language for accelerating LLMs ☆18 Updated 11 months ago