EmbeddedLLM / embeddedllm
EmbeddedLLM: API server for embedded device deployment. Currently supports CUDA, OpenVINO, IpexLLM, DirectML, and CPU.
☆32Updated 4 months ago
Alternatives and similar repositories for embeddedllm:
Users who are interested in embeddedllm are comparing it to the libraries listed below:
- Machine Learning Serving focused on GenAI with simplicity as the top priority.☆58Updated last month
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆88Updated this week
- A list of language models with permissive licenses such as MIT or Apache 2.0☆24Updated 3 months ago
- LLM reads a paper and produces a working prototype☆48Updated 2 weeks ago
- Very minimal (and stateless) agent framework☆41Updated last month
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you?☆22Updated 7 months ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks.☆44Updated 5 months ago
- Simple examples using Argilla tools to build AI☆53Updated 3 months ago
- ☆78Updated last month
- Use Grounding DINO, Segment Anything, and CLIP to label objects in images.☆27Updated last year
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale …☆18Updated last week
- OpenMindedChatbot is a Proof Of Concept that leverages the power of Open source Large Language Models (LLM) with Function Calling capabil…☆29Updated last year
- Train, tune, and infer Bamba model☆85Updated last month
- ☆41Updated 2 months ago
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform☆37Updated last year
- ☆45Updated last year
- Self-host LLMs with vLLM and BentoML☆87Updated this week
- GPT-4 Level Conversational QA Trained In a Few Hours☆58Updated 6 months ago
- Set of scripts to finetune LLMs☆36Updated 10 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡☆64Updated 3 months ago
- ☆20Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite.☆33Updated 11 months ago
- GRDN.AI app for garden optimization☆70Updated last year
- Deploy a light and full OpenAI API for production with vLLM to support /v1/embeddings with all embedding models.☆40Updated 7 months ago
- ☆12Updated 11 months ago
- ☆99Updated 5 months ago
- ☆20Updated 10 months ago
- ☆18Updated 3 months ago
- Practical and advanced guide to LLMOps. It provides a solid understanding of large language models’ general concepts, deployment techniqu…☆59Updated 6 months ago
- Streamlit app for recommending eval functions using prompt diffs☆27Updated last year