hcd233 / Aris-AI-Model-Server
An OpenAI-compatible API that integrates LLM, Embedding, and Reranker.
☆17Updated 3 months ago
Alternatives and similar repositories for Aris-AI-Model-Server
Users who are interested in Aris-AI-Model-Server are comparing it to the libraries listed below.
- Open Source Text Embedding Models with OpenAI Compatible API☆163Updated last year
- xllamacpp - a Python wrapper of llama.cpp☆66Updated last week
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.☆177Updated 4 months ago
- Lightweight continuous batching with OpenAI compatibility using HuggingFace Transformers, including T5 and Whisper.☆29Updated 8 months ago
- LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities.☆106Updated 4 months ago
- Acceleration for large-model inference frameworks, making LLMs fly☆22Updated last year
- Deploy a lightweight, full OpenAI API for production with vLLM, supporting /v1/embeddings with all embedding models.☆44Updated last year
- Get up and running with Llama 3, Mistral, Gemma, and other large language models.☆30Updated 2 weeks ago
- Forces DeepSeek R1 models to engage in extended reasoning by intercepting early termination tokens.☆19Updated 9 months ago
- LM inference server implementation based on *.cpp.☆292Updated last week
- TextEmbed is a REST API crafted for high-throughput and low-latency embedding inference. It accommodates a wide variety of embedding mode…☆26Updated last year
- Sentence Transformers API: An OpenAI compatible embedding API server☆68Updated last year
- Play with any API server that is compatible with the OpenAI API☆24Updated last year
- This project connects Dify to OpenwebUI through a Pipeline, combining OpenwebUI's front-end strengths and ecosystem with Dify's strong model extensibility and Workflow benefits.☆38Updated last year
- MinerU API server☆82Updated 11 months ago
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second.☆216Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆132Updated last year
- A library integrating embedding and reranker models from OpenAI, SentenceTransformers, etc., for semantic search in vector databases.☆57Updated 8 months ago
- instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG,…☆54Updated last year
- ☆20Updated 4 months ago
- Easy to deploy. A cloud service providing a Python code interpreter sandbox for Code-Interpreter.☆56Updated last year
- Library for model distillation☆158Updated 2 months ago
- mcp-difyworkflow-server is an mcp server Tools application that implements the query and invocation of Dify workflows, supporting the on-…☆59Updated 11 months ago
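- A simple AIGC microservice, reachable over HTTP or gRPC, with support for streaming responses.☆10Updated 2 years ago
- CLI tool to quantize gguf, gptq, awq, hqq and exl2 models☆76Updated 11 months ago
- A vLLM hybrid-inference extension plugin supporting multi-NUMA hybrid inference; single-GPU inference of the Qwen3-Next model can reach 1000+ prefill☆26Updated 3 weeks ago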
- This is an NVIDIA AI Workbench example project that demonstrates an end-to-end model development workflow using Llamafactory.☆68Updated last year
- LLM inference in C/C++☆103Updated 3 weeks ago
- Jina DeepSearch UI☆126Updated 3 months ago
- A set of tools to create synthetically-generated data from documents☆37Updated 3 months ago