01-ai / Descartes
☆108 Updated last year
Alternatives and similar repositories for Descartes
Users interested in Descartes are comparing it to the libraries listed below.
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆253 Updated this week
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). ☆242 Updated last year
- Byzer-retrieval is a distributed retrieval system designed as a backend for LLM RAG (Retrieval Augmented Generation). The system su… ☆48 Updated 2 months ago
- llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deploy… ☆80 Updated last year
- Puck is a high-performance ANN search engine ☆354 Updated this week
- ☆29 Updated 9 months ago
- ☆32 Updated last year
- A demo built on Megrez-3B-Instruct, integrating a web search tool to enhance the model's question-and-answer capabilities. ☆38 Updated 5 months ago
- Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone ☆304 Updated this week
- vLLM Documentation in Simplified Chinese / vLLM 中文文档 ☆71 Updated 2 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆136 Updated 5 months ago
- Efficient AI Inference & Serving ☆469 Updated last year
- Mixture-of-Experts (MoE) Language Model ☆188 Updated 8 months ago
- LLM Inference benchmark ☆419 Updated 10 months ago
- Qwen GRPO Graph Extraction RL Finetune ☆49 Updated last month
- A C++-based Transformer framework for edge computing. ☆124 Updated 6 months ago
- Akcio is a demonstration project for Retrieval Augmented Generation (RAG). It leverages the power of LLM to generate responses and uses v… ☆255 Updated last year
- Imitate OpenAI with Local Models ☆87 Updated 9 months ago
- A Chinese-native benchmark for evaluating Retrieval-Augmented Generation ☆117 Updated last year
- AI-native database for embedding vectors ☆172 Updated 6 months ago
- xllamacpp - a Python wrapper of llama.cpp ☆36 Updated last week
- An open-source LLM based on the MoE (Mixture-of-Experts) structure. ☆58 Updated 10 months ago
- Modular and structured prompt caching for low-latency LLM inference ☆94 Updated 6 months ago
- [ACL 2025 demo track] ROGRAG: A Robustly Optimized GraphRAG Framework ☆134 Updated 3 weeks ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. ☆100 Updated last year
- XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc. ☆38 Updated last year
- GLM Series Edge Models ☆139 Updated 3 months ago
- ☆105 Updated last year
- A toolkit for running on-device large language models (LLMs) in apps ☆72 Updated 10 months ago
- Transformer-related optimization, including BERT and GPT ☆39 Updated 2 years ago