OpenCSGs / llm-inference
llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as a UI, RESTful API, auto-scaling, computing resource management, monitoring, and more.
☆80Updated 11 months ago
Alternatives and similar repositories for llm-inference:
Users who are interested in llm-inference compare it to the libraries listed below
- A framework for training large language models; supports LoRA, full-parameter fine-tuning, etc.; define YAML to start training/fine-tuning of y…☆26Updated 7 months ago
- ☆108Updated last year
- bisheng-unstructured library☆43Updated last week
- This repository provides installation scripts and configuration files for deploying the CSGHub instance, including Helm charts and Docker…☆14Updated this week
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆244Updated last week
- GLM Series Edge Models☆134Updated last month
- The CSGHub SDK is a powerful Python client specifically designed to interact seamlessly with the CSGHub server. This toolkit is engineere…☆14Updated 2 weeks ago
- Integrated user interface for use with HAI Platform☆48Updated last year
- LLM scheduler user interface☆16Updated 11 months ago
- bisheng model services backend☆27Updated 8 months ago
- AGI module library architecture diagram☆75Updated last year
- Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone☆293Updated last week
- Imitate OpenAI with Local Models☆88Updated 7 months ago
- Convert files into Markdown to help RAG pipelines or LLMs understand them; based on markitdown and MinerU, which can provide high-quality PDF parsing.☆82Updated 3 weeks ago
- A Toolkit for Running On-device Large Language Models (LLMs) in APP☆72Updated 9 months ago
- Akcio is a demonstration project for Retrieval Augmented Generation (RAG). It leverages the power of LLM to generate responses and uses v…☆255Updated last year
- This is an NVIDIA AI Workbench example project that demonstrates an end-to-end model development workflow using Llamafactory.☆54Updated 6 months ago
- Multi-agents and plugins repo for DB-GPT; can complete various tasks around databases.☆99Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆134Updated 4 months ago
- ☆44Updated last year
- AutoHub: A Personal Browser Automation Assistant☆16Updated this week
- vLLM Documentation in Simplified Chinese☆57Updated this week
- ROGRAG: A Robustly Optimized GraphRAG Framework☆110Updated last week
- ☆29Updated 7 months ago
- ☆32Updated last year
- Byzer-retrieval is a distributed retrieval system designed as a backend for LLM RAG (Retrieval Augmented Generation). The system su…☆48Updated last month
- A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments.☆29Updated 3 weeks ago
- FasterTransformer for the CodeGeeX model☆63Updated last year
- An easy-to-use framework for modular RAG☆349Updated this week
- Delta-CoMe can achieve near-lossless 1-bit compression; it has been accepted at NeurIPS 2024☆57Updated 5 months ago