🎉CUDA notes / collection of frequently asked interview questions / C++ notes. Personal notes, updated sporadically: sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist etc.
☆47Feb 23, 2024Updated 2 years ago
Alternatives and similar repositories for CUDA-Learn-Note
Users that are interested in CUDA-Learn-Note are comparing it to the libraries listed below.
- ☆23Aug 14, 2024Updated last year
- 🐱 ncnn int8 model quantization evaluation☆14Oct 10, 2022Updated 3 years ago
- 🎉CUDA notes / collection of frequently asked interview questions / C++ notes. Personal notes, updated sporadically: sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist etc.☆43Jan 25, 2024Updated 2 years ago
- Python bindings to the PSRDada ringbuffer implementation☆11Jan 30, 2024Updated 2 years ago
- BOOM's Simulation Accelerator.☆13Dec 16, 2021Updated 4 years ago
- A 3D fluid simulation on the GPU using C++ and Vulkan.☆13Jun 12, 2022Updated 3 years ago
- JSF 2 TagLib for Apache Shiro. This taglib reimplements all original JSP tags as their Facelets equivalent, so they can be used in JSF pr…☆29Oct 5, 2016Updated 9 years ago
- ☆12Apr 16, 2024Updated last year
- GEMM by WMMA (tensor core)☆15Jul 31, 2022Updated 3 years ago
- StarPU Runtime system☆16Sep 22, 2010Updated 15 years ago
- Notes on learning SpringBoot☆12Jun 17, 2022Updated 3 years ago
- 🚀 Train your own VLA "large model" end to end: a 26M-parameter visual multimodal VLM trained from scratch in 1 hour! 🌏 Train a 26M-parameter VLM from scratch in just 1 hour!☆30Oct 16, 2025Updated 5 months ago
- An implementation of "Air Meshes for Robust Collision Handling", SIGGRAPH (2015)☆16Aug 30, 2017Updated 8 years ago
- NeurIPS 2020 Spotlight Paper☆13Dec 20, 2021Updated 4 years ago
- Work in progress object cutting based on Nvidia Flex☆14May 17, 2021Updated 4 years ago
- About the source code of "Merging Similar Patterns for Hardware Prefetching" paper, which is accepted in MICRO 2022.☆14Mar 1, 2023Updated 3 years ago
- Official demo repo of CSC4005 Parallel Programming 2022 Fall @ CUHK(Shenzhen)☆17Aug 6, 2023Updated 2 years ago
- [ICML'25] Breaking Silos: Adaptive Model Fusion Unlocks Better Time Series Forecasting | Sample-level adaptive multi-model ensemble for time series forecasting☆27May 22, 2025Updated 10 months ago
- Parallel Prefix Sum (Scan) with CUDA☆29Jun 22, 2024Updated last year
- molecular dynamics (MD) simulation of 10^13 atoms.☆12Nov 22, 2024Updated last year
- This is the implementation repository of our ICSE'22 paper: Muffin: Testing Deep Learning Libraries via Neural Architecture Fuzzing.☆33Jun 17, 2022Updated 3 years ago
- ☆14May 14, 2024Updated last year
- Mitigation of periodic as well as narrow-band and spiky/bursty RFI from time-domain filterbank data.☆18Apr 23, 2021Updated 4 years ago
- jump to a place when program runs to the max instruction number☆15Dec 14, 2023Updated 2 years ago
- Unstructured computations on emerging architectures.☆14Jun 1, 2022Updated 3 years ago
- A large-scale training and benchmarking framework for rPPG.☆10Nov 26, 2024Updated last year
- This repository contains the source codes for the paper: "SPACE: A Simulator for Physical Interactions and Causal Learning in 3D Environm…☆16Oct 11, 2021Updated 4 years ago
- PhysWorld: From Real Videos to World Models of Deformable Objects via Physics-Aware Demonstration Synthesis☆36Oct 27, 2025Updated 5 months ago
- Accelerating Multitask Training Through Adaptive Transition [Efficient ML Model]☆12May 23, 2025Updated 10 months ago
- A C++ port of karpathy/micrograd, a tiny scalar-valued autograd engine and a neural net library☆13Nov 24, 2023Updated 2 years ago
- A simple Online Judge core☆17May 16, 2019Updated 6 years ago
- ☆23May 10, 2023Updated 2 years ago
- Multi-GPU Framework for Voxel Grid Computations☆66Mar 26, 2026Updated 2 weeks ago
- Android final course project: a news reading app☆10Jun 6, 2016Updated 9 years ago
- Implementation of analytic collision penalty eigensystems (with Matlab)☆19Oct 23, 2025Updated 5 months ago
- taichi hackathon repo.☆18Dec 15, 2022Updated 3 years ago
- A reference implementation of "WRAPD: Weighted Rotation-aware ADMM for Parameterization and Deformation" written in C++. This code suppor…☆13Aug 9, 2021Updated 4 years ago
- A Winograd Minimal Filter Implementation in CUDA☆28Aug 25, 2021Updated 4 years ago
- CoRdE model implementation: simulating ropes, chains, and other elastic strings☆11May 8, 2020Updated 5 years ago