kyuz0 / amd-strix-halo-vllm-toolboxes
☆48Updated 2 months ago
Alternatives and similar repositories for amd-strix-halo-vllm-toolboxes
Users interested in amd-strix-halo-vllm-toolboxes are comparing it to the libraries listed below.
- LLM Fine Tuning Toolbox images for Ryzen AI 395+ Strix Halo☆34Updated 2 months ago
- Ampere optimized llama.cpp☆28Updated last month
- How to build an ACP compliant agent that uses MCP as well!☆11Updated 6 months ago
- An NVIDIA AI Workbench example project for fine-tuning a Mistral 7B model☆66Updated last year
- Fully-featured, beautiful web interface for vLLM - built with NextJS.☆161Updated 6 months ago
- The easiest & fastest way to run LLMs in your home lab☆71Updated 3 months ago
- Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, Kokoro-TTS, Embedding and Rerank models over OpenAI endpoints.☆247Updated 3 weeks ago
- Offline LLM chatbot with personalized memory — works on CPU with multi-session memory support.☆22Updated 5 months ago
- InferX: Inference as a Service Platform☆139Updated this week
- GPU Power and Performance Manager☆61Updated last year
- ✅ Iterative Transparent Reasoning System by chonkyDB ✅ combining reasoning, graph and vector for trustworthy, explainable and smart LLMs …☆35Updated 5 months ago
- Use smol agents to do research and then update CSV columns with its findings.☆41Updated 9 months ago
- No-code CLI designed for accelerating ONNX workflows☆216Updated 5 months ago
- ☆79Updated last month
- ☆146Updated 3 weeks ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆93Updated this week
- Intel® AI Assistant Builder☆128Updated this week
- A comprehensive platform for managing, testing, and leveraging Ollama AI models with advanced features for customization, workflow automa…☆47Updated 8 months ago
- Building open version of OpenAI o1 via reasoning traces (Groq, ollama, Anthropic, Gemini, OpenAI, Azure supported) Demo: https://hugging…☆187Updated last year
- A platform to self-host AI on easy mode☆177Updated last week
- ☆176Updated 3 months ago
- Route LLM requests to the best model for the task at hand.☆133Updated last week
- Sparse Inferencing for transformer based LLMs☆213Updated 3 months ago
- Open Deep Researcher with openai compatible endpoint, now completely local with ollama, local playwright via searxng with citations and p…☆146Updated 8 months ago
- This is the Mixture-of-Agents (MoA) concept, adapted from the original work by TogetherAI. My version is tailored for local model usage a…☆118Updated last year
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe…☆84Updated last month
- Multi-agent autonomous research system using LangGraph and LangChain. Generates citation-backed reports with credibility scoring and web …☆65Updated last week
- EmbeddedLLM: API server for Embedded Device Deployment. Currently support CUDA/OpenVINO/IpexLLM/DirectML/CPU☆43Updated last year
- ☆223Updated last month
- For individual users, watsonx Code Assistant can access a local IBM Granite model☆37Updated 5 months ago