High-performance computing course, CUDA programming examples, and a deep learning inference framework
☆71Sep 21, 2023Updated 2 years ago
Alternatives and similar repositories for HPC
Users that are interested in HPC are comparing it to the libraries listed below.
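- Source code for the High-Performance Computer Systems course at the University of Chinese Academy of Sciences (UCAS)☆12Jun 17, 2020Updated 5 years ago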
- Website for CSE 234, Winter 2025☆13Mar 24, 2025Updated last year
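- Vision projects covering object detection, object tracking, monocular distance estimation, pose recognition, lane line recognition, license plate recognition, the A* algorithm, and vehicle tracking and ranging☆14Dec 12, 2024Updated last year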
- Fxxk PingAnChengDian☆17Nov 17, 2021Updated 4 years ago
- HPC-roadmap for 2021 recruitment☆47Sep 19, 2025Updated 6 months ago
- hpc-learning☆783May 30, 2024Updated last year
- a lightweight C++ LLaMA inference engine for mobile devices☆15Oct 28, 2023Updated 2 years ago
- GEMM by WMMA (tensor core)☆15Jul 31, 2022Updated 3 years ago
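- This project gives a simple introduction to CUDA, covering the CUDA execution model, the thread hierarchy, the CUDA memory model, how to write kernel functions, and two ways of using CUDA extensions from PyTorch. It offers a basic entry point into developing PyTorch-based CUDA extensions.☆95Nov 12, 2021Updated 4 years ago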
- ☆15Feb 20, 2024Updated 2 years ago
- Solutions for http://minitorch.github.io☆15Jun 13, 2022Updated 3 years ago
- ☆16Jun 25, 2024Updated last year
- Slides and assignments for the Fall 2020 High-Performance Computing course at Sun Yat-sen University☆47Jan 25, 2021Updated 5 years ago
- A Rust reimplementation of the deep learning inference frameworks from https://github.com/zjhellofss/KuiperInfer and https://github.com/zjhellofss/kuiperdatawhale.☆17Apr 9, 2024Updated 2 years ago
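- Deploying YOLOV5 with TensorRT☆13Jul 30, 2022Updated 3 years ago
- Learning the basics of CUDA programming☆15Jun 27, 2019Updated 6 years ago
- libxco is a lightweight, high-performance coroutine network library☆12Jul 10, 2025Updated 9 months ago
- Companion code for the bilibili video series "Introduction to CUDA 12.x Parallel Programming (C++ Edition)"☆34Aug 12, 2024Updated last year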
- ☆14Sep 25, 2025Updated 6 months ago
- Official repository for "Vid2World: Crafting Video Diffusion Models to Interactive World Models" (ICLR 2026), https://arxiv.org/abs/2505.…☆49Jan 27, 2026Updated 2 months ago
- PID controller with FPGA hardware☆21Oct 5, 2019Updated 6 years ago
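- Study notes on high-performance computing, including notes and code demos for related topics; still being improved. If you find it helpful, please give it a Star; it helps the author a lot. Thank you!☆473Mar 28, 2023Updated 3 years ago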
- [AAAI 2026] WorldRFT: Latent World Model Planning with Reinforcement Fine-Tuning for Autonomous Driving☆34Dec 23, 2025Updated 3 months ago
- Python implementation of A Complex Quasi-Newton Proximal Method for Image Reconstruction in Compressed Sensing MRI paper☆12Oct 19, 2025Updated 5 months ago
- Advanced Programming - HW3☆16Mar 23, 2022Updated 4 years ago
- This project leverages Large Language Models (LLMs) and a multi-agent framework to analyze stock prices, gather relevant news, and genera…☆23Feb 26, 2025Updated last year
- A CUDA learning path based on the book "CUDA Programming: Basics and Practice" (《cuda编程-基础与实践》) by 樊哲勇.☆413Jan 15, 2024Updated 2 years ago
- This repository contains the details of controlling a 3-DOF robotic arm with 4 servo motors using FPGA. The design is executed using the N…☆22May 5, 2022Updated 3 years ago
- ☆13Dec 29, 2020Updated 5 years ago
- DGEMM on KNL, achieve 75% MKL☆19May 19, 2022Updated 3 years ago
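- A good project for campus, autumn, and spring recruitment and for internships: build from scratch a large-model inference framework that supports LLama2/3 and Qwen2.5.☆528Oct 28, 2025Updated 5 months ago
- A mini distributed key-value storage engine. RPC, consistent hashing, a master-slave-client architecture, and a heartbeat mechanism are largely implemented; the primary-backup structure, data migration, data replication, and other major distributed features will be added later☆18Nov 15, 2015Updated 10 years ago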
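- A good project for campus, autumn, and spring recruitment and for internships! Build a high-performance deep learning inference library from scratch, supporting inference for llama2, Unet, Yolov5, Resnet, and other models. Implement a high-performance deep learning inference library st…☆3,386Jun 22, 2025Updated 9 months ago
- A script that reads video with opencv and transmits the video frames over a socket, reducing the data volume by 90% compared with transmitting via ffmpeg☆11May 14, 2024Updated last year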
- ☆46Sep 8, 2025Updated 7 months ago
- FPGA version of CORDIC algorithm that evaluates all the trigonometric and anti-trigonometric functions.☆24Nov 20, 2019Updated 6 years ago
- ☆11Oct 18, 2022Updated 3 years ago
- ☆12Feb 18, 2014Updated 12 years ago
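- A C++ implementation of MIT 6.824 Distributed Systems. It covers backend topics such as distributed system design, RPC usage, MapReduce, basic database design, the Raft algorithm, and distributed consistency, and uses many Linux system calls☆58Apr 29, 2023Updated 2 years ago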