OpenCSGs / llm-inference
llm-inference is a platform for publishing and managing LLM inference. It provides a wide range of out-of-the-box features for model deployment, such as a UI, a RESTful API, auto-scaling, computing resource management, and monitoring.
☆85Updated last year
Alternatives and similar repositories for llm-inference
Users who are interested in llm-inference are comparing it to the libraries listed below.
- A framework for training large language models; supports LoRA, full-parameter fine-tuning, etc.; define YAML to start training/fine-tuning of y…☆28Updated 10 months ago
- bisheng-unstructured library☆54Updated 2 months ago
- ☆111Updated last year
- AGI module library architecture diagram☆76Updated last year
- Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone☆311Updated 2 weeks ago
- An integrated user interface for use with HAI Platform☆52Updated 2 years ago
- This repository provides installation scripts and configuration files for deploying a CSGHub instance, including Helm charts and Docker…☆16Updated this week
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).☆244Updated last year
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆262Updated 2 months ago
- [ACL2025 demo track] ROGRAG: A Robustly Optimized GraphRAG Framework☆166Updated last month
- A Toolkit for Running On-device Large Language Models (LLMs) in Apps☆77Updated last year
- Index of the CodeFuse Repositories☆138Updated 11 months ago
- ☆30Updated 11 months ago
- Efficient AI Inference & Serving☆472Updated last year
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc.☆40Updated last year
- Delta-CoMe can achieve near-lossless 1-bit compression and has been accepted at NeurIPS 2024☆57Updated 8 months ago
- A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments.☆28Updated 4 months ago
- ☆32Updated last year
- Akcio is a demonstration project for Retrieval Augmented Generation (RAG). It leverages the power of LLM to generate responses and uses v…☆256Updated last year
- The tool is used for building and driving workflows specifically tailored for AI initiatives. It can be used to construct AI agents.☆149Updated last year
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec…☆163Updated 3 weeks ago
- bisheng model services backend☆30Updated last year
- An open version of Manus.☆61Updated 4 months ago
- High-performance LLM inference based on our optimized version of FastTransformer☆123Updated last year
- Mixture-of-Experts (MoE) Language Model☆189Updated 10 months ago
- ☆170Updated this week
- This is an NVIDIA AI Workbench example project that demonstrates an end-to-end model development workflow using Llamafactory.☆62Updated 9 months ago
- zero: zero-training LLM hyperparameter tuning☆31Updated 2 years ago
- An open platform for enhancing the capability of LLMs in workflow orchestration.☆159Updated 4 months ago
- LLM Group Chat Framework: chat with multiple LLMs at the same time.☆313Updated last month