YellowOldOdd / SDBI
Simple Dynamic Batching Inference
☆145Updated 3 years ago
Alternatives and similar repositories for SDBI
Users who are interested in SDBI are comparing it to the libraries listed below
- Transformer related optimization, including BERT, GPT☆59Updated last year
- Models and examples built with OneFlow☆97Updated 7 months ago
- ☆139Updated last year
- ☆71Updated 2 years ago
- oneflow documentation☆69Updated 11 months ago
- OneFlow models for benchmarking.☆104Updated 9 months ago
- symmetric int8 gemm☆66Updated 4 years ago
- ☆127Updated 5 months ago
- ☆79Updated last year
- Transformer related optimization, including BERT, GPT☆39Updated 2 years ago
- ☆99Updated 3 years ago
- Running BERT without Padding☆471Updated 3 years ago
- A Fast Multi-processing BERT-Inference System☆101Updated 2 years ago
- ☆23Updated 2 years ago
- Transformer related optimization, including BERT, GPT☆17Updated last year
- ☆96Updated 3 years ago
- DeepLearning Framework Performance Profiling Toolkit☆285Updated 3 years ago
- OneFlow->ONNX☆43Updated 2 years ago
- Hands-on tutorial on the core principles of TVM☆61Updated 4 years ago
- ☆93Updated 2 months ago
- optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052☆473Updated last year
- Place for meetup slides☆140Updated 4 years ago
- InsNet Runs Instance-dependent Neural Networks with Padding-free Dynamic Batching.☆66Updated 3 years ago
- Server-side deep learning deployment examples☆451Updated 5 years ago
- ☆148Updated 4 months ago
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.☆37Updated 3 months ago
- Serving Inside Pytorch☆160Updated 3 weeks ago
- Translates networks from different platforms to an Intermediate Representation (IR)☆44Updated 7 years ago
- ☆127Updated 3 years ago
- Möbius Transformation for Fast Inner Product Search on Graph☆22Updated 4 years ago