A tutorial for LLM developers covering engine design, service deployment, evaluation and benchmarking, and more. It provides an optimized client/server (C/S) style LLM inference engine.
☆19 · Sep 5, 2023 · Updated 2 years ago
Alternatives and similar repositories for LLM-Inference-Deployment-Tutorial
Users that are interested in LLM-Inference-Deployment-Tutorial are comparing it to the libraries listed below.
- Deploy, launch and use LLMs on AWS ☆16 · Jun 2, 2023 · Updated 2 years ago
- langchain-streamlit demo with streaming llm, memory, and langsmith feedback ☆17 · Feb 4, 2026 · Updated 2 months ago
- ☆16 · Sep 4, 2023 · Updated 2 years ago
- SadTalker gradio_demo.py file with code section that allows you to set the eye blink and pose reference videos for the software to use wh… ☆11 · Jun 20, 2023 · Updated 2 years ago
- Code associated with the paper **Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees**. ☆28 · Apr 25, 2023 · Updated 3 years ago
- 🚀 Automatically convert unstructured data into a high-quality 'textbook' format, optimized for fine-tuning Large Language Models (LLMs) ☆26 · Oct 15, 2023 · Updated 2 years ago
- Website for Stanford SysML Seminar ☆17 · Oct 27, 2025 · Updated 6 months ago
- Codes of the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform. ☆16 · Nov 1, 2021 · Updated 4 years ago
- ☆31 · Jun 13, 2023 · Updated 2 years ago
- DINet-based inference service for video streams and videos ☆17 · Nov 8, 2023 · Updated 2 years ago
- On-device real-time RAG App built using Jina Reader, Mediapipe, Gemma 2b IT LLM. ☆15 · Apr 15, 2024 · Updated 2 years ago
- Chat with your Database! Natural language to SQL with a friendly UI. LangChain+Streamlit+SQL Agents with SQLAlchemy wrap-up (BigQuery/MyS… ☆25 · Dec 7, 2023 · Updated 2 years ago
- Machine Learning System ☆14 · May 11, 2020 · Updated 5 years ago
- Created an AI model that is proficient in lip-syncing i.e. synchronizing an audio file with a video file using Wav2Lip. ☆10 · Oct 28, 2023 · Updated 2 years ago
- ☆15 · Jan 11, 2024 · Updated 2 years ago
- ☆15 · May 12, 2022 · Updated 3 years ago
- Production First and Production Ready End-to-End Keyword Spotting Toolkit ☆12 · May 30, 2022 · Updated 3 years ago
- deepnn architecture for generating monophonic melodies conditioned on chords ☆13 · Apr 15, 2019 · Updated 7 years ago
- TaskWeaver Plugins ☆12 · Jan 28, 2024 · Updated 2 years ago
- Face Parsing via SegNeXt, trained on CelebAMask-HQ ☆18 · Dec 21, 2023 · Updated 2 years ago
- ☆19 · Sep 15, 2022 · Updated 3 years ago
- Super Mario is a legendary game we all cherish! In this project, we will deploy Super Mario on Amazon EKS (Elastic Kubernetes Service) us… ☆11 · Feb 3, 2026 · Updated 2 months ago
- Chainer implementation of Graph Neural Networks for the Prediction of Substrate-Specific Organic Reaction Conditions ☆10 · Apr 11, 2021 · Updated 5 years ago
- pytorch code examples for measuring the performance of collective communication calls in AI workloads ☆20 · Sep 18, 2025 · Updated 7 months ago
- netbeacon - monitoring your network capture, NIDS or network analysis process ☆20 · Apr 5, 2026 · Updated 3 weeks ago
- Never forget the resource that helps to close that sales call! Power a real-time speech-to-text agent with retrieval augmented generation… ☆14 · Jan 23, 2024 · Updated 2 years ago
- ATC23 AE ☆45 · May 11, 2023 · Updated 2 years ago
- Implementations of transformer models in pytorch ☆14 · Jun 2, 2020 · Updated 5 years ago
- ☆26 · Oct 2, 2023 · Updated 2 years ago
- Fleming-R1: Toward Expert-Level Medical Reasoning via Reinforcement Learning ☆31 · Sep 29, 2025 · Updated 7 months ago
- Implementation of an attack/decay model for piano transcription ☆11 · Feb 1, 2018 · Updated 8 years ago
- Explore Inter-layer Expert Affinity in MoE Model Inference ☆16 · May 6, 2024 · Updated last year
- ☆14 · Apr 20, 2023 · Updated 3 years ago
- Transform audio files into mel spectrograms for text-to-speech model training ☆12 · Aug 25, 2021 · Updated 4 years ago
- Scripts for reading, extracting, and organizing data from either HTML or PDF documents and prepare them to be converted into embeddings f… ☆13 · Aug 26, 2024 · Updated last year
- The official repository of "SCANet: Real-Time Face Parsing Using Spatial and Channel Attention," presented at the 2023 UR (Ubiquitous Rob… ☆18 · Sep 15, 2023 · Updated 2 years ago
- A simple application of DTW Algorithm in isolate word speech recognition. ☆17 · Mar 9, 2020 · Updated 6 years ago
- A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal… ☆11 · Sep 16, 2024 · Updated last year
- This repository will take you through creating a FastAPI StableDiffusion app (including Dockerfile) all the way to adding a new feature u… ☆38 · Nov 9, 2022 · Updated 3 years ago