hbchen121 / SimpleCNN_Release
Pure C/C++ CNN implementation with CUDA acceleration.
☆19Updated 3 years ago
Related projects
Alternatives and complementary repositories for SimpleCNN_Release
- ☆220Updated last month
- Personal Notes for Learning HPC & Parallel Computation [Active Adding New Content]☆59Updated 2 years ago
- Algorithm course at UCAS☆29Updated 2 months ago
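- Several small parallel programming experiments using OpenMP and MPI: matrix multiplication, matrix LU decomposition, and document vectorization for document classification☆27Updated 3 years ago
- High-performance computing course & CUDA programming examples & deep learning inference framework☆29Updated last year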
- ☆29Updated last year
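- Solutions for Programming Massively Parallel Processors: A Hands-on Approach, 2nd edition☆27Updated 2 years ago
- University of Chinese Academy of Sciences, High Performance Computing Systems, Spring 2021☆8Updated 3 years ago
- [2024 edition] UCAS, Chen Yunji: AI Computing Systems (AICS) lab code☆175Updated 5 months ago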
- Machine Learning Compiler Road Map☆41Updated last year
- AI Computing Systems (智能计算系统), Chen Yunji☆123Updated last year
- Hands-on model tuning with TVM, profiled on a Mac M1, an x86 CPU, and a GTX-1080 GPU.☆39Updated last year
- A simple CNN framework implemented in C++ to train on MNIST data. Done!☆11Updated 2 years ago
- UCAS High Performance Computing System: review notes and exam questions☆12Updated 2 years ago
- Using TVM to deploy Transformer on CPU and GPU☆11Updated 3 years ago
- Source code for the UCAS High Performance Computer Systems course☆11Updated 4 years ago
- ☆51Updated 2 years ago
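- A good project for campus recruiting (autumn and spring rounds) and internships: build, from scratch, an LLM inference framework supporting LLama2/3 and Qwen2.5.☆220Updated last week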
- ☆109Updated 2 years ago
- ☆97Updated 7 months ago
- ☆15Updated 5 years ago
- ☆38Updated 2 years ago
- Six major CUDA parallel computing patterns: code and notes☆58Updated 4 years ago
- A minimalist and extensible PyTorch extension for implementing custom backend operators in PyTorch.☆28Updated 7 months ago
- Parallel Prefix Sum (Scan) with CUDA☆16Updated 4 months ago
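- Study notes on high-performance computing, with code demos for the topics covered; still being improved. If it helps, please give it a Star; it helps the author a lot, thank you!☆372Updated last year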
- A tutorial for CUDA&PyTorch☆117Updated 2 weeks ago
- ☆82Updated 6 months ago
- A CUDA tutorial for learning CUDA programming from scratch☆195Updated 4 months ago
- Code for the book 《CUDA编程基础与实践》 (CUDA Programming: Fundamentals and Practice)☆95Updated 2 years ago