FlyAIBox / dcu-in-action
Hands-on practice with the Hygon DCU, a Chinese-made accelerator card: large-model training, fine-tuning, inference, and more
☆64 · Updated 5 months ago
Alternatives and similar repositories for dcu-in-action
Users interested in dcu-in-action are comparing it to the libraries listed below.
- llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deploy…☆90 · Updated last year
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang☆61 · Updated last year
- Manages vllm-nccl dependency☆17 · Updated last year
- LLM inference service performance testing☆44 · Updated 2 years ago
- An integrated user interface for use with the HAI Platform☆53 · Updated 2 years ago
- vLLM Router☆54 · Updated last year
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec…☆214 · Updated 3 months ago
- vLLM Documentation in Simplified Chinese / vLLM 中文文档☆143 · Updated 3 weeks ago
- GLM Series Edge Models☆156 · Updated 6 months ago
- ☆71 · Updated last week
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).☆250 · Updated last year
- Delta-CoMe can achieve near-lossless 1-bit compression; accepted by NeurIPS 2024☆59 · Updated last year
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆272 · Updated 5 months ago
- [ACL2025 demo track] ROGRAG: A Robustly Optimized GraphRAG Framework☆189 · Updated 3 weeks ago
- Run ChatGLM2-6B on the BM1684X☆49 · Updated last year
- ☆79 · Updated 2 years ago
- ☆181 · Updated this week
- ☆114 · Updated last year
- The driver for LMCache core to run in vLLM☆59 · Updated 11 months ago
- ☆56 · Updated last year
- ☆29 · Updated last year
- LLM101n: Let's build a Storyteller (Chinese version)☆138 · Updated last year
- Transformer-related optimization, including BERT, GPT☆17 · Updated 2 years ago
- A minimalist benchmarking tool designed to test the routine-generation capabilities of LLMs.☆27 · Updated last year
- ☆25 · Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆39 · Updated 4 months ago
- A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments.☆30 · Updated 9 months ago
- NVIDIA TensorRT Hackathon 2023 second-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM☆43 · Updated 2 years ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks.☆120 · Updated last year
- ☆387 · Updated this week