some hpc project for learning
☆27Aug 28, 2024Updated last year
Alternatives and similar repositories for hpc_project
Users that are interested in hpc_project are comparing it to the libraries listed below.
- ☆34Apr 21, 2026Updated 2 weeks ago
- ☆10Jun 6, 2023Updated 2 years ago
- Operator library☆17Jul 9, 2025Updated 9 months ago
- Multiple GEMM operators are constructed with cutlass to support LLM inference.☆20Aug 3, 2025Updated 9 months ago
- Training camp lecture notes☆21Jan 21, 2025Updated last year
- ☆49Mar 4, 2026Updated 2 months ago
- Optimize GEMM with tensorcore step by step☆37Dec 17, 2023Updated 2 years ago
- Multiscale Voxelization Operator CUDA Implementation for LiDAR Point Cloud☆21Jul 19, 2023Updated 2 years ago
- Build CUDA Neural Network From Scratch☆22Aug 28, 2024Updated last year
- easy cuda code☆97Dec 24, 2024Updated last year
- ☆28Aug 9, 2025Updated 8 months ago
- ☆11Mar 9, 2022Updated 4 years ago
- An annotated nano_vllm repository that also adds MiniCPM4 support and the ability to register new models☆183Aug 11, 2025Updated 8 months ago
- Möbius Transformation for Fast Inner Product Search on Graph☆22Jun 3, 2021Updated 4 years ago
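- A good project for campus recruiting (autumn and spring hiring) and internships: build from scratch an LLM inference framework that supports LLama2/3 and Qwen2.5.☆532Oct 28, 2025Updated 6 months ago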
- CSAPP3e Course Labs Files☆10Oct 9, 2020Updated 5 years ago
- Fast Synchronization-Free Algorithms for Parallel Sparse Triangular Solves with Multiple Right-Hand Sides (SpTRSM)☆17Feb 14, 2020Updated 6 years ago
- Cluster simulator with far memory☆12Apr 28, 2020Updated 6 years ago
- CUDA 8-bit Tensor Core Matrix Multiplication based on m16n16k16 WMMA API☆37Sep 15, 2023Updated 2 years ago
- Implementation and optimization of matrix multiplication on single CPU (HPC-THU-2023-Autumn)☆18Feb 27, 2024Updated 2 years ago
- Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective☆15Oct 22, 2024Updated last year
- ☆12Dec 17, 2023Updated 2 years ago
- ☆18May 10, 2023Updated 2 years ago
- ToyLLM: Learning LLM from Scratch☆25Apr 27, 2026Updated last week
- Created a simple neural network using C++17 standard and the Eigen library that supports both forward and backward propagation.☆11Jul 27, 2024Updated last year
- Simple and efficient memory pool is implemented with C++11.☆10Jun 2, 2022Updated 3 years ago
- Source code analysis of langgraph's deepagent☆16Jan 1, 2026Updated 4 months ago
- ☆19Apr 5, 2024Updated 2 years ago
- ☆19May 31, 2023Updated 2 years ago
- ☆12Mar 13, 2023Updated 3 years ago
- Reimplementation of some fundamental sampling-based arm planning algorithms☆12Dec 30, 2022Updated 3 years ago
- Artifacts of EVT ASPLOS'24☆30Mar 6, 2024Updated 2 years ago
- A unified and extensible pipeline for deep learning model inference with C++. Now support yolov8, yolov9, clip, and nanosam. More models …☆12Aug 3, 2025Updated 9 months ago
- Sparse kernels for GNNs based on TVM☆17Nov 18, 2020Updated 5 years ago
- ☆14Oct 2, 2023Updated 2 years ago
- A light llama-like llm inference framework based on the triton kernel.☆184Jan 5, 2026Updated 4 months ago
- Free Chinese-language computer programming books; contributions welcome☆15Aug 13, 2015Updated 10 years ago
- HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. 🚀 The official implementation of https://arx …☆28Feb 17, 2025Updated last year
- ☆13Sep 5, 2024Updated last year