HFAiLab / hai-platform
A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute
☆583Updated last year
Alternatives and similar repositories for hai-platform:
Users who are interested in hai-platform are comparing it to the libraries listed below
- The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.☆925Updated this week
- RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.☆656Updated last month
- FlagScale is a large model toolkit based on open-sourced projects.☆246Updated this week
- ☆319Updated last month
- GLake: optimizing GPU memory management and IO transmission.☆437Updated 3 months ago
- Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.☆266Updated last year
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆237Updated this week
- A streamlined and customizable framework for efficient large model evaluation and performance benchmarking☆574Updated this week
- Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.☆2,800Updated this week
- A fast communication-overlapping library for tensor/expert parallelism on GPUs.☆509Updated this week
- DLRover: An Automatic Distributed Deep Learning System☆1,363Updated this week
- A flexible and efficient training framework for large-scale alignment tasks☆322Updated 3 weeks ago
- An integrated user interface for use with HAI Platform☆43Updated last year
- Best practice for training LLaMA models in Megatron-LM☆644Updated last year
- ☆214Updated last year
- A PyTorch Native LLM Training Framework☆748Updated 2 months ago
- FireFlyer Record file format, writer and reader for DL training samples.☆198Updated 2 years ago
- LLM Inference benchmark☆401Updated 7 months ago
- ☆310Updated 9 months ago
- Easy, fast, and cheap pretrain, finetune, serving for everyone☆288Updated this week
- ☆156Updated this week
- Optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052 ☆469Updated 11 months ago
- Disaggregated serving system for Large Language Models (LLMs).☆491Updated 6 months ago
- InternEvo is an open-sourced lightweight training framework that aims to support model pre-training without the need for extensive dependencie…☆363Updated this week
- FlagPerf is an open-source software platform for benchmarking AI chips.☆324Updated last month
- HFAI deep learning models☆142Updated last year
- ☆273Updated last year
- Efficient and easy multi-instance LLM serving☆326Updated this week
- One-click installation files for Kubeflow in China☆344Updated 2 years ago