PFCCLab / Starter

[HACKATHON Prep Camp] PaddlePaddle Launch Program Training Camp

☆17

Alternatives and similar repositories for Starter

Users that are interested in Starter are comparing it to the libraries listed below


PFCCLab / Camp
PaddlePaddle Escort Program Training Camp
☆20 Updated last month
hyperai / triton-cn
Triton Documentation in Simplified Chinese
☆94 Updated 2 weeks ago
ZonePG / cs-notes
my cs notes
☆55 Updated last year
harleyszhang / lite_llama
A light llama-like llm inference framework based on the triton kernel.
☆166 Updated 2 months ago
dlsyscourse / hw2
☆10 Updated 2 months ago
InfiniTensor / InfiniTensor
☆274 Updated last month
PaddlePaddle / community
PaddlePaddle Developer Community
☆127 Updated last week
xlite-dev / ffpa-attn
🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.
☆233 Updated 2 weeks ago
caiwanxianhust / FasterLLaMA
A llama model inference framework implemented in CUDA C++
☆62 Updated last year
xgqdut2016 / cuda_code
easy cuda code
☆90 Updated 11 months ago
YuxueYang1204 / CudaDemo
Implement custom operators in PyTorch with cuda/c++
☆74 Updated 2 years ago
InfiniTensor / RefactorGraph
A layered, decoupled deep learning inference engine
☆76 Updated 9 months ago
openmlsys / openmlsys-cuda
Tutorials for writing high-performance GPU operators in AI frameworks.
☆133 Updated 2 years ago
harleyszhang / llm_counts
LLM theoretical performance analysis tools supporting params, FLOPs, memory, and latency analysis.
☆113 Updated 4 months ago
zjhellofss / triton_course
☆39 Updated 6 months ago
li199603 / parallel_prefix_sum
Parallel Prefix Sum (Scan) with CUDA
☆27 Updated last year
xgqdut2016 / hpc_project
some hpc project for learning
☆25 Updated last year
zjhellofss / kuiperdatawhale
☆302 Updated last year
xgqdut2016 / hpc2torch
☆29 Updated last month
InfiniTensor / InfiniLM-Rust
☆125 Updated last month
LDLINGLINGLING / nano_vllm_note
An annotated nano_vllm repository that also adds MiniCPM4 support and the ability to register new models
☆108 Updated 3 months ago
zjhellofss / KuiperCourse
A course on Bilibili
☆79 Updated 2 years ago
AyakaGEMM / Hands-on-GEMM
☆144 Updated last year
toyaix / TritonLLM
LLM Inference via Triton (Flexible & Modular): Focused on Kernel Optimization using CUBIN binaries, Starting from gpt-oss Model
☆56 Updated last month
PaddlePaddle / PaConvert
PaddlePaddle Code Convert Toolkit. A deep learning code conversion tool for PaddlePaddle.
☆119 Updated this week
PaddlePaddle / PaddleCustomDevice
PaddlePaddle custom device implementation. (Custom hardware integration for PaddlePaddle)
☆100 Updated this week
gogongxt / nano-sglang
☆64 Updated last week
interestingLSY / CUDA-From-Correctness-To-Performance-Code
Codes & examples for "CUDA - From Correctness to Performance"
☆117 Updated last year
zjhellofss / KuiperLLama
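A good project for campus recruitment, autumn and spring hiring, and internships: build from scratch an LLM inference framework that supports LLama2/3 and Qwen2.5.
☆461 Updated last month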
ShigureLab / python-lib-starter
Just a template for quickly creating a python library.
☆10 Updated last week