OpenCSGs / llm-inference
llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as a UI, a RESTful API, auto-scaling, computing resource management, monitoring, and more.
☆74Updated 8 months ago
Alternatives and similar repositories for llm-inference:
Users who are interested in llm-inference are comparing it to the libraries listed below.
- A framework for training large language models; supports LoRA, full-parameter fine-tuning, etc.; define YAML to start training/fine-tuning of y…☆24Updated 3 months ago
- ☆105Updated 9 months ago
- bisheng-unstructured library☆40Updated last month
- A Toolkit for Running On-device Large Language Models (LLMs) in APP☆58Updated 6 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆208Updated this week
- Deploy your own OpenAI API 🤩, based on Flask and transformers (uses the Baichuan2-13B-Chat-4bits model, which can run on a single Tesla T4 GPU); implements the OpenAI Chat, Models, and Completions interfaces, including streaming…☆88Updated last year
- The CSGHub SDK is a powerful Python client specifically designed to interact seamlessly with the CSGHub server. This toolkit is engineere…☆13Updated 2 weeks ago
- It's an open-source LLM based on an MoE structure.☆57Updated 6 months ago
- Byzer-retrieval is a distributed retrieval system designed as a backend for LLM RAG (Retrieval Augmented Generation). The system su…☆45Updated this week
- Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone☆272Updated 2 weeks ago
- AGI module library architecture diagram☆75Updated last year
- GLM Series Edge Models☆123Updated 2 weeks ago
- LLM scheduler user interface☆14Updated 8 months ago
- ☆36Updated 3 months ago
- Implement OpenAI APIs and plugin-enabled ChatGPT with open source LLM and other models.☆121Updated 7 months ago
- Imitate OpenAI with Local Models☆85Updated 4 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆127Updated last month
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆127Updated 7 months ago
- An easy-to-use framework for modular RAG☆307Updated this week
- OpenLLaMA-Chinese, a permissively licensed open-source instruction-following model based on OpenLLaMA☆65Updated last year
- agentcraft helps you quickly build AI agent applications for a variety of application scenarios☆52Updated 3 weeks ago
- Run chatglm3-6b on BM1684X☆38Updated 10 months ago
- Efficient AI Inference & Serving☆462Updated last year
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).☆236Updated 10 months ago
- SUS-Chat: Instruction tuning done right☆48Updated last year
- ☆105Updated last year
- Delta-CoMe can achieve near-lossless 1-bit compression; it has been accepted by NeurIPS 2024☆52Updated 2 months ago
- Run ChatGLM2-6B on BM1684X☆49Updated 10 months ago
- bisheng model services backend☆27Updated 5 months ago
- Built on the robust XTuner backend framework, XTuner Chat GUI offers a user-friendly platform for quick and efficient local model inferen…☆13Updated 11 months ago