OpenCSGs / llm-inference
llm-inference is a platform for publishing and managing LLM inference. It provides out-of-the-box features for model deployment, such as a UI, a RESTful API, auto-scaling, computing resource management, monitoring, and more.
☆82Updated last year
Alternatives and similar repositories for llm-inference
Users who are interested in llm-inference are comparing it to the libraries listed below.
- A framework for training large language models, supporting LoRA, full-parameter fine-tuning, etc.; define YAML to start training/fine-tuning of y…☆28Updated 9 months ago
- bisheng-unstructured library☆51Updated last month
- ☆109Updated last year
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆256Updated 3 weeks ago
- vLLM documentation in Simplified Chinese☆79Updated last month
- This repository provides installation scripts and configuration files for deploying a CSGHub instance, including Helm charts and Docker…☆16Updated this week
- Easy, fast, and cheap pretrain, finetune, serving for everyone☆306Updated this week
- The CSGHub SDK is a powerful Python client specifically designed to interact seamlessly with the CSGHub server. This toolkit is engineere…☆17Updated this week
- bisheng model services backend☆29Updated 11 months ago
- Architecture diagram of the AGI module library☆75Updated last year
- A Toolkit for Running On-device Large Language Models (LLMs) in APP☆73Updated 11 months ago
- An integrated user interface for use with HAI Platform☆52Updated 2 years ago
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc.☆39Updated last year
- Mixture-of-Experts (MoE) Language Model☆189Updated 9 months ago
- xllamacpp - a Python wrapper of llama.cpp☆43Updated last week
- Byzer-retrieval is a distributed retrieval system designed as a backend for LLM RAG (Retrieval Augmented Generation). The system su…☆48Updated 3 months ago
- LLM scheduler user interface☆15Updated last year
- ☆168Updated this week
- XVERSE-65B: A multilingual large language model developed by XVERSE Technology Inc.☆139Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆135Updated 6 months ago
- GLM Series Edge Models☆142Updated last week
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆133Updated last year
- LLM inference service performance testing☆41Updated last year
- ☆50Updated last week
- SUS-Chat: Instruction tuning done right☆48Updated last year
- ☆32Updated last year
- A demo built on Megrez-3B-Instruct, integrating a web search tool to enhance the model's question-and-answer capabilities.☆38Updated 6 months ago
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).☆242Updated last year
- zero: LLM hyperparameter tuning with zero training☆31Updated last year
- Imitate OpenAI with Local Models☆87Updated 9 months ago