Comprehensive CUDA tutorials for Maths & ML with examples
☆232Jun 11, 2025Updated last year
Alternatives and similar repositories for cuda-tutorials
Users who are interested in cuda-tutorials are comparing it to the libraries listed below.
- ☆22May 26, 2025Updated last year
- Make triton easier☆50Jun 12, 2024Updated 2 years ago
- Abacus Code LLM (珠算代码大模型)☆58Sep 26, 2024Updated last year
- Fetch arxiv data to LLM-friendly text☆132Feb 18, 2026Updated 4 months ago
- a collection of resources around LLMs, aggregated for the workshop "Mastering LLMs: End-to-End Fine-Tuning and Deployment" by Dan Becker …☆108May 31, 2024Updated 2 years ago
- Rust implementation of Surya☆67Mar 1, 2025Updated last year
- Track and Collaborate on ML & AI Experiments.☆44Mar 10, 2025Updated last year
- Repository containing awesome resources regarding Hugging Face tooling.☆49Jan 8, 2024Updated 2 years ago
- High-Performance C++ Fundamental Library☆649Mar 16, 2026Updated 3 months ago
- Recreating PyTorch from scratch (C/C++, CUDA, NCCL and Python, with multi-GPU support and automatic differentiation!)☆166Nov 25, 2025Updated 7 months ago
- This is an example of creating an AI agent with a flowchart☆12Jul 22, 2024Updated last year
- The official implementation of the paper "Self-Updatable Large Language Models by Integrating Context into Model Parameters"☆15May 18, 2025Updated last year
- JAX library for training sub-4B foundation models for edge☆302Aug 28, 2024Updated last year
- Utilities for efficient fine-tuning, inference and evaluation of code generation models☆21Oct 3, 2023Updated 2 years ago
- The GPU RAM Estimator provides a simple tool for estimating GPU memory usage during training and inference.☆35Apr 9, 2024Updated 2 years ago
- Code for MM-DINOv2: Adapting Foundation Models for Multi-Modal Medical Image Analysis (MICCAI2025)☆26Oct 27, 2025Updated 8 months ago
- An implementation of the Llama architecture, to instruct and delight☆21May 31, 2025Updated last year
- Samples for CUDA developers that demonstrate features in the CUDA Toolkit☆9,340May 27, 2026Updated last month
- A streamlined, user-friendly JSON streaming preprocessor, crafted in Python.☆116Sep 20, 2024Updated last year
- ☆51May 31, 2024Updated 2 years ago
- Open deep learning compiler stack for cpu, gpu and specialized accelerators☆19Updated this week
- ☆13Jul 15, 2021Updated 4 years ago
- A fast RWKV Tokenizer written in Rust☆54Aug 12, 2025Updated 10 months ago
- Llama3-Tutorial (XTuner, LMDeploy, OpenCompass)☆508May 10, 2024Updated 2 years ago
- ☆33Nov 4, 2024Updated last year
- Learning records for building a large language model from scratch☆58Jan 1, 2025Updated last year
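- Generates auto-scrolling shot-by-shot video storyboard breakdown tables☆16Jul 25, 2024Updated last year
- Yugong wiki (愚公wiki) is a lightweight platform for online blogs, knowledge bases, personal notes, or enterprise document collaboration. You can download the desktop version to use as a personal notebook, edit documents online, or self-host it as a service, since it is a fully open-source writing platform☆17Jul 22, 2024Updated last year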
- Multi-threaded matrix multiplication and cosine similarity calculations for dense and sparse matrices. Appropriate for calculating the K …☆85Dec 28, 2024Updated last year
- Official Implementation of UTrice: Unifying Primitives in Differentiable Ray Tracing and Rasterization via Triangles for Particle-Based 3…☆30Jan 13, 2026Updated 5 months ago
- 🟠 A study guide to learn about Graph Neural Networks (GNNs)☆1,313Jan 6, 2023Updated 3 years ago
- implement a simple jvm with java☆104Mar 7, 2024Updated 2 years ago
- ☆53Nov 14, 2024Updated last year
- Implementation of Direct Preference Optimization☆17Jul 17, 2023Updated 2 years ago
- Applied AI experiments and examples for PyTorch☆323Aug 22, 2025Updated 10 months ago
- 😜 A visual dataset of meme stickers, annotated using the image-understanding capabilities of glm-4v and step-1v.☆149Apr 27, 2024Updated 2 years ago
- Universal Neurons in GPT2 Language Models☆30May 28, 2024Updated 2 years ago
- HunyuanVideo: A Systematic Framework For Large Video Generation Model☆48Dec 14, 2024Updated last year
- GPU programming related news and material links☆2,206Jun 15, 2026Updated 2 weeks ago