01-ai / Descartes
☆111Updated last year
Alternatives and similar repositories for Descartes
Users that are interested in Descartes are comparing it to the libraries listed below
- Easy, fast, and cheap pretraining, finetuning, and serving for everyone☆311Updated 2 weeks ago
- llm-inference is a platform for publishing and managing llm inference, providing a wide range of out-of-the-box features for model deploy…☆85Updated last year
- Byzer-retrieval is a distributed retrieval system designed as a backend for LLM RAG (Retrieval Augmented Generation). The system su…☆48Updated 4 months ago
- bisheng-unstructured library☆54Updated 2 months ago
- Akcio is a demonstration project for Retrieval Augmented Generation (RAG). It leverages the power of LLM to generate responses and uses v…☆256Updated last year
- Mixture-of-Experts (MoE) Language Model☆189Updated 10 months ago
- XVERSE-65B: A multilingual large language model developed by XVERSE Technology Inc.☆139Updated last year
- An open-source LLM based on an MoE structure.☆58Updated last year
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including …☆262Updated 2 months ago
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).☆244Updated last year
- ☆30Updated 11 months ago
- A toolkit for running on-device large language models (LLMs) in apps☆77Updated last year
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc.☆40Updated last year
- ☆32Updated last year
- A demo built on Megrez-3B-Instruct, integrating a web search tool to enhance the model's question-and-answer capabilities.☆38Updated 7 months ago
- [ACL2025 demo track] ROGRAG: A Robustly Optimized GraphRAG Framework☆166Updated last month
- Qwen GRPO Graph Extraction RL Finetune☆51Updated 4 months ago
- ☆105Updated last year
- Efficient AI Inference & Serving☆472Updated last year
- Puck is a high-performance ANN search engine☆362Updated 2 months ago
- vLLM documentation in Simplified Chinese☆88Updated 2 months ago
- AGI module library architecture diagram☆76Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs☆136Updated 7 months ago
- xllamacpp - a Python wrapper of llama.cpp☆48Updated last week
- 360zhinao☆290Updated 2 months ago
- The official code for "Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning"☆263Updated last year
- LLM Inference benchmark☆424Updated last year
- gpt_server is an open-source framework for production-grade deployment of LLMs, Embedding, Reranker, ASR, and TTS models.☆202Updated last week
- Using Llama-3.1 70b on Groq to create o1-like reasoning chains☆18Updated 10 months ago
- Delta-CoMe can achieve near-lossless 1-bit compression; the work has been accepted at NeurIPS 2024☆57Updated 8 months ago