EmbeddedLLM / embeddedllm
EmbeddedLLM: API server for embedded device deployment. Currently supports CUDA/OpenVINO/IpexLLM/DirectML/CPU
☆46Updated last year
Alternatives and similar repositories for embeddedllm
Users who are interested in embeddedllm are comparing it to the libraries listed below.
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆115Updated 8 months ago
- ☆40Updated last year
- Example implementation of Iteration of Thought - Give a star if you like the project☆41Updated last year
- Self-host LLMs with vLLM and BentoML☆162Updated 3 weeks ago
- Own your AI, search the web with it🌐😎☆94Updated 11 months ago
- ☆22Updated last year
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…☆78Updated last year
- ☆57Updated this week
- ☆57Updated 10 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)☆91Updated 11 months ago
- Simple examples using Argilla tools to build AI☆57Updated last year
- Training setup for LangChain's Open Deep Research☆73Updated 3 months ago
- Train, tune, and run inference with the Bamba model☆137Updated 6 months ago
- Luth is a state-of-the-art series of fine-tuned LLMs for French☆40Updated 2 months ago
- ☆66Updated 8 months ago
- ⚡ Bhumi – The fastest AI inference client for Python, built with Rust for unmatched speed, efficiency, and scalability 🚀☆63Updated 2 months ago
- LLM reads a paper and produces a working prototype☆60Updated 8 months ago
- ☆101Updated last year
- frozen-in-time version of our Paper Finder agent for reproducing evaluation results☆214Updated 4 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆103Updated 7 months ago
- Benchmark and optimize LLM inference across frameworks with ease☆150Updated 3 months ago
- ☆92Updated last month
- Nexusflow function call, tool use, and agent benchmarks.☆30Updated last year
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆94Updated this week
- Modular, open source LLMOps stack that separates concerns: LiteLLM unifies LLM APIs, manages routing and cost controls, and ensures high-…☆128Updated 10 months ago
- A Python library to orchestrate LLMs in a neural network-inspired structure☆52Updated last year
- Dynamic Metadata based RAG Framework☆78Updated 2 weeks ago
- ☆107Updated last month
- Machine Learning Serving focused on GenAI with simplicity as the top priority.☆59Updated 2 months ago
- Elasticsearch integration into LangChain☆69Updated last week