OpenCSGs / llm-inference
llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as a UI, RESTful API, auto-scaling, computing resource management, monitoring, and more.
☆91Updated last year
Alternatives and similar repositories for llm-inference
Users that are interested in llm-inference are comparing it to the libraries listed below
- A framework for training large language models; supports LoRA, full-parameter fine-tuning, etc.; define YAML to start training/fine-tuning of y…☆31Updated last year
- ☆114Updated last year
- Integrated user interface for use with HAI Platform☆53Updated 2 years ago
- bisheng-unstructured library☆57Updated 8 months ago
- Hands-on with the Hygon DCU, a Chinese domestic accelerator card (large-model training, fine-tuning, inference, etc.)☆66Updated 5 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆274Updated 5 months ago
- Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone☆315Updated 6 months ago
- This repository provides installation scripts and configuration files for deploying the CSGHub instance, including Helm charts and Docker…☆18Updated this week
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).☆251Updated last year
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc.☆39Updated last year
- Delta-CoMe can achieve near-lossless 1-bit compression; it has been accepted by NeurIPS 2024☆58Updated last year
- Mixture-of-Experts (MoE) Language Model☆195Updated last year
- Efficient AI Inference & Serving☆480Updated 2 years ago
- GLM Series Edge Models☆157Updated 7 months ago
- LLM inference service performance testing☆44Updated 2 years ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray☆131Updated 4 months ago
- [ACL2025 demo track] ROGRAG: A Robustly Optimized GraphRAG Framework☆194Updated last month
- fastertransformer for codegeex model☆65Updated 2 years ago
- Byzer-retrieval is a distributed retrieval system designed as a backend for LLM RAG (Retrieval Augmented Generation). The system su…☆49Updated 10 months ago
- It's an open-source LLM based on an MoE structure.☆58Updated last year
- xllamacpp - a Python wrapper of llama.cpp☆72Updated this week
- ☆32Updated last year
- LLM scheduler user interface☆21Updated last year
- Tablestore for Agent Memory☆45Updated last month
- Akcio is a demonstration project for Retrieval Augmented Generation (RAG). It leverages the power of LLM to generate responses and uses v…☆260Updated 2 years ago
- ☆29Updated last year
- CPM.cu is a lightweight, high-performance CUDA implementation for LLMs, optimized for end-device inference and featuring cutting-edge tec…☆223Updated 2 weeks ago
- This is an NVIDIA AI Workbench example project that demonstrates an end-to-end model development workflow using Llamafactory.☆72Updated last year
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.☆78Updated last year
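- Source-code analysis of the official transformers library. In the era of large AI models, PyTorch and transformer are the new operating system, and everything else is software running on top of them.☆17Updated 2 years ago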