guqiong96 / LvllmLinks

LvLLM is a special NUMA extension of vllm that makes full use of CPU and memory resources, reduces GPU memory requirements, and features an efficient GPU parallel and NUMA parallel architecture, supporting hybrid inference for MOE large models.

☆78

Alternatives and similar repositories for Lvllm

Users that are interested in Lvllm are comparing it to the libraries listed below

Sorting:

zRzRzRzRzRzRzR / GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型
☆27Updated 5 months ago
hkgood / Ollama_ChatTTS
LLM voice chat project by Connect ChatTTS with Local Ollama, 连接本地部署的 Ollama 和 ChatTTS，实现和LLM的语音对话
☆65Updated last year
thad0ctor / llama-server-launcher
☆107Updated 2 months ago
and270 / thinking_effort_processor
☆94Updated 4 months ago
ztxz16 / exvllm
vllm混合推理扩展插件，支持多NUMA混合推理，单卡推理Qwen3-Next模型可达1000+ prefill
☆26Updated 3 weeks ago
zlai-llm / zlai
zlai
☆22Updated last year
srszzw / stagehand-glm
Stagehand-GLM 是基于 stagehand-python 深度定制的AI浏览器自动化框架，专门适配了智谱AI的GLM文本和多模态大模型。它提供了渐进式的RPA操作策略，让开发者在智能便捷和成本效益之间找到最佳平衡点。
☆23Updated 3 months ago
Ikaros-521 / GraphRAG-Ollama-UI
GraphRAG-Ollama-UI + GraphRAG4OpenWebUI 融合版（有gradio webui配置生成RAG索引，有fastapi提供RAG API服务）
☆104Updated last year
jianchang512 / fireredasr-ui
一个中文语音转文字项目，封装自FireRedASR
☆82Updated 9 months ago
gpustack / llama-box
LM inference server implementation based on *.cpp.
☆292Updated last week
xorbitsai / xllamacpp
xllamacpp - a Python wrapper of llama.cpp
☆66Updated last week
gpustack / vox-box
A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.
☆177Updated 4 months ago
IEIT-Yuan / YuanChat
☆46Updated 9 months ago
weidwonder / crawl4ai-mcp-server
用于提供给本地开发者的 LLM的高效互联网搜索&内容获取的MCP Server，节省你的token
☆121Updated 6 months ago
aixiaoxin123 / mcp_demo_project
mcp的webui界面，支持客户端连接多个sse服务端，支持 openai、deepseek、qwen等大模型，另外附上构建的 agent的 stdio和sse的简单天气查询的完整示例
☆36Updated 6 months ago
chuanruihu / Level-Navi-Agent-Search
The Level-Navi Agent, a framework that requires no training and utilizes large language models for deep query understanding and precise s…
☆81Updated 11 months ago
KylinMountain / markify
Convert files into markdown to help RAG or LLM understand, based on markitdown and MinerU, which could provide high quality pdf parser.
☆131Updated 8 months ago
LeslieLeung / dify-connector
dify-connector is a tool to publish Dify apps to various IM platforms. | dify-connector 是一个将 Dify 发布到各种 IM 平台的工具。
☆100Updated last year
mutalisk999 / deepsite
mirror of https://huggingface.co/spaces/enzostvs/deepsite
☆76Updated 8 months ago
ubergarm / r1-ktransformers-guide
run DeepSeek-R1 GGUFs on KTransformers
☆256Updated 8 months ago
femto / minion-agent
A simple agent framework that's capable of browser use + mcp + auto instrument + plan + deep research + more
☆330Updated last month
evangelosmeklis / deepdrone
An AI agent to control drones from your CLI
☆138Updated 3 months ago
lanesky / livekityoutubedemo
☆29Updated last year
gen-cli / gen-cli
Agents of C.L.I.
☆141Updated 2 months ago
3dify-project / dify-mcp-client
MCP Client as an Agent Strategy Plugin. Support GUI operation via UI-TARS-SDK.
☆158Updated 4 months ago
XJF2332 / GOT-OCR-2-GUI
GOT-OCR的GUI版本，提供OCR、导出PDF、批处理等功能，但不提供训练功能
☆182Updated 2 weeks ago
SystemPanic / vllm-windows
A high-throughput and memory-efficient inference and serving engine for LLMs (Windows build & kernels)
☆243Updated this week
crazywoola / dify-tools-worker
A function calling tool can be deployed to Cloudflare Workers with openapi schema
☆101Updated last year
codai-agent / codai
Codai is an AI programming tool that boosts coding efficiency and empowers non-programmers. Its future plans include introducing a local …
☆24Updated 2 months ago
dubeno / NotebookLLM-Chinese
Cross Platform Open Sourced Chinese NoteBookLM app based on Electron, Use DeepSeek + Reecho.ai
☆80Updated last year