EmbeddedLLM / embeddedllm

EmbeddedLLM: API server for Embedded Device Deployment. Currently supports CUDA/OpenVINO/IpexLLM/DirectML/CPU.

☆40

Alternatives and similar repositories for embeddedllm

Users who are interested in embeddedllm are comparing it to the repositories listed below.

bentoml / BentoVLLM
Self-host LLMs with vLLM and BentoML
☆134 Updated 2 weeks ago
weaviate / structured-rag
Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models
☆108 Updated 3 months ago
allenai / olmo-cookbook
OLMost every training recipe you need to perform data interventions with the OLMo family of models.
☆36 Updated this week
LAION-AI / bud-e
A general human-ai interaction platform.
☆15 Updated 6 months ago
substratusai / vllm-docker
☆62 Updated 3 months ago
sambanova / agents
☆50 Updated this week
AstraBert / PrAIvateSearch
Own your AI, search the web with it🌐😎
☆86 Updated 6 months ago
guidance-ai / jsonschemabench
☆47 Updated last month
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆91 Updated 5 months ago
facebookresearch / matrix
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…
☆73 Updated this week
flowaicom / flow-judge
Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…
☆74 Updated 8 months ago
miralab-ai / autoreason
☆40 Updated 7 months ago
LLMSELECTOR / LLMSELECTOR
☆71 Updated 4 months ago
shivamsanju / ragswift
🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform
☆38 Updated last year
raphaelmansuy / iteration_of_tought
Example implementation of Iteration of Thought - Give a star if you like the project
☆42 Updated 6 months ago
zhudotexe / redel
ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo)
☆82 Updated 4 months ago
DS4SD / deepsearch-glm
Create fast graph language models from converted PDF documents for knowledge extraction and Q&A.
☆55 Updated 5 months ago
foundation-model-stack / bamba
Train, tune, and run inference with the Bamba model
☆130 Updated last month
ArturTanona / grpo_unsloth_docker
☆57 Updated 5 months ago
langchain-ai / langchain-elastic
Elasticsearch integration into LangChain
☆57 Updated 5 months ago
adithya-s-k / YoloGemma
Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio…
☆81 Updated last year
phunterlau / paper_without_code
LLM reads a paper and produces a working prototype
☆58 Updated 3 months ago
Cerebras / DocChat
GPT-4 Level Conversational QA Trained In a Few Hours
☆63 Updated 10 months ago
jmanhype / dspy-self-discover-framework
Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r…
☆63 Updated last year
AlexBodner / How_Much_VRAM
☆101 Updated 10 months ago
aniketmaurya / fastserve-ai
Machine Learning Serving focused on GenAI with simplicity as the top priority.
☆59 Updated last week
EmbeddedLLM / vllm
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
☆87 Updated this week
langfuse / oss-llmops-stack
Modular, open source LLMOps stack that separates concerns: LiteLLM unifies LLM APIs, manages routing and cost controls, and ensures high-…
☆106 Updated 5 months ago
bentoml / BentoCrewAI
Serving CrewAI Agent as REST API with BentoML, optionally with self-host open-source LLMs
☆19 Updated 6 months ago
parea-ai / parea-sdk-py
Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
☆78 Updated 5 months ago