PFCCLab / paddlefx
An experimental project for paddle python IR.
☆15Updated 11 months ago
Related projects
Alternatives and complementary repositories for paddlefx
- A Bytecode level Implementation of Symbolic OpCode Translator For PaddlePaddle☆15Updated last year
- Compiler Infrastructure for Neural Networks☆143Updated last year
- ☆22Updated last year
- OneFlow->ONNX☆42Updated last year
- A layered, decoupled deep learning inference engine☆60Updated 2 months ago
- PaddlePaddle Developer Community☆88Updated this week
- Models and examples built with OneFlow☆96Updated 3 weeks ago
- oneflow documentation☆68Updated 4 months ago
- play gemm with tvm☆84Updated last year
- Paddle Automatically Diff Precision Toolkits.☆47Updated 6 months ago
- ☆140Updated 6 months ago
- ☆97Updated 7 months ago
- FP8 flash attention implemented on the Ada architecture using the cutlass repository☆52Updated 3 months ago
- ☆79Updated last year
- ☆136Updated this week
- A simplified flash-attention implementation using cutlass, intended for teaching☆32Updated 3 months ago
- PaddlePaddle custom device implementation. (『飞桨』 custom hardware integration implementation)☆70Updated this week
- Tutorials of Extending and importing TVM with CMAKE Include dependency.☆10Updated last month
- ☆93Updated 3 years ago
- NART = NART is not A RunTime, a deep learning inference framework.☆38Updated last year
- ☆14Updated 2 years ago
- Tutorials for writing high-performance GPU operators in AI frameworks.☆122Updated last year
- ☆164Updated this week
- ☆70Updated last year
- Summary of some awesome work for optimizing LLM inference☆35Updated this week
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections☆114Updated 2 years ago
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.☆28Updated 2 months ago
- ☆14Updated this week
- ☆103Updated 7 months ago