OpenCSGs / llm-inference
llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as a UI, a RESTful API, auto-scaling, computing resource management, monitoring, and more.
☆80 · Updated last year
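For orientation, below is a minimal sketch of calling a model served behind an HTTP endpoint, in the spirit of the RESTful API mentioned in the description. The base URL, route, model name, and payload shape are assumptions for illustration (many serving platforms expose an OpenAI-compatible chat route), not llm-inference's documented API.

```python
# Hypothetical example: querying a deployed model over HTTP.
# The host, route, model name, and payload schema below are assumptions for
# illustration only; consult the llm-inference docs for the actual endpoint.
import requests

BASE_URL = "http://localhost:8000"  # assumed address of a local deployment

payload = {
    "model": "my-llm",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello!"}],
}

resp = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json())
```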
Alternatives and similar repositories for llm-inference
Users interested in llm-inference are comparing it to the libraries listed below.
- A framework for training large language models; supports LoRA, full-parameter fine-tuning, etc.; define a YAML file to start training/fine-tuning of y… ☆28 · Updated 8 months ago
- This repository provides installation scripts and configuration files for deploying the CSGHub instance, including Helm charts and Docker… ☆16 · Updated this week
- ☆108 · Updated last year
- LLM scheduler user interface ☆16 · Updated last year
- The CSGHub SDK is a powerful Python client specifically designed to interact seamlessly with the CSGHub server. This toolkit is engineere… ☆14 · Updated last week
- bisheng-unstructured library ☆48 · Updated last week
- A toolkit for running on-device large language models (LLMs) in apps ☆72 · Updated 10 months ago
- An integrated user interface for use with the HAI Platform ☆51 · Updated 2 years ago
- AGI module library architecture diagram ☆75 · Updated last year
- ☆32 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆136 · Updated 5 months ago
- Dingo: A Comprehensive Data Quality Evaluation Tool ☆165 · Updated this week
- OpenLLaMA-Chinese, permissively licensed open-source instruction-following models based on OpenLLaMA ☆66 · Updated last year
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). ☆242 · Updated last year
- bisheng model services backend ☆27 · Updated 10 months ago
- Imitate OpenAI with Local Models ☆87 · Updated 9 months ago
- vLLM documentation in Simplified Chinese / vLLM 中文文档 ☆71 · Updated 2 weeks ago
- Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone ☆304 · Updated this week
- Delta-CoMe achieves near-lossless 1-bit compression; accepted at NeurIPS 2024 ☆57 · Updated 6 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆253 · Updated this week
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆132 · Updated 11 months ago
- Efficient, Flexible, and Highly Fault-Tolerant Model Service Management Based on SGLang ☆53 · Updated 6 months ago
- Using Llama-3.1 70b on Groq to create o1-like reasoning chains ☆19 · Updated 8 months ago
- Easy to deploy. A cloud service providing a Python code-interpreter sandbox for Code-Interpreter. ☆51 · Updated last year
- ☆112 · Updated last month
- Qwen DianJin: LLMs for the Financial Industry by Alibaba Cloud ☆105 · Updated last week
- ☆45 · Updated last year
- Recipes for creating AI applications with APIs from DashScope (and friends)! ☆54 · Updated 8 months ago
- Bambo is a new proxy framework. Compared with mainstream frameworks, it is more lightweight and flexible and can handle various load task… ☆35 · Updated 3 months ago
- An Innovative Agent Framework Driven by KG Engine ☆763 · Updated 4 months ago