Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios.
☆106Aug 14, 2024Updated last year
Alternatives and similar repositories for batch-inference
Users that are interested in batch-inference are comparing it to the libraries listed below.
- Predict the performance of LLM inference services☆23Sep 18, 2025Updated 8 months ago
- aigc evals☆10Dec 2, 2023Updated 2 years ago
- ☆23Mar 24, 2023Updated 3 years ago
- Convert PyTorch BERT weights to TensorFlow☆22May 19, 2020Updated 6 years ago
- Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering☆23Oct 19, 2023Updated 2 years ago
- Transformer implementation in TensorFlow 2.0☆26Mar 25, 2023Updated 3 years ago
- ☆18Jan 2, 2024Updated 2 years ago
- 🗣️ Convert between phonetic alphabets☆11Feb 7, 2022Updated 4 years ago
- ☆18Dec 27, 2023Updated 2 years ago
- Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion☆20Jul 9, 2019Updated 6 years ago
- Simple and easy stable diffusion inference with LightningModule on GPU, CPU and MPS (Possibly all devices supported by Lightning).☆16Jul 27, 2023Updated 2 years ago
- A time delay estimation method for event-based time-series data. Time delay estimation is also known as the correction of time offsets an…☆16Dec 3, 2025Updated 6 months ago
- MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.☆2,107Jun 30, 2025Updated 11 months ago
- golang vad (voice activity detection) library based on webrtc☆12Dec 13, 2021Updated 4 years ago
- Translation of the StrategyQA dataset☆22Apr 12, 2024Updated 2 years ago
- Spoken Language Identification on Common Voice and AudioSet using Deep Learning☆41Feb 4, 2026Updated 4 months ago
- Code for the paper "RIR-in-a-Box : Estimating Room Acoustics from 3D Mesh Data through Shoebox Approximation" presented at Interspeech 20…☆16Sep 1, 2024Updated last year
- A powerful Laravel storage driver that enables seamless synchronization of files across multiple disks, with an integrated cache disk for…☆15Nov 11, 2025Updated 6 months ago
- superfast text to speech in any voice☆62Feb 16, 2026Updated 3 months ago
- PHO-LID: A Unified Model to Incorporate Acoustic-Phonetic and Phonotactic Information for Language Identification☆21Aug 24, 2023Updated 2 years ago
- High-performance vector search engine with no loss of accuracy through GPU and dynamic placement☆32Jul 12, 2025Updated 10 months ago
- aicreator for aidata☆14May 17, 2023Updated 3 years ago
- ☆125Mar 17, 2024Updated 2 years ago
- Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation [ICML 2021]☆14Jul 17, 2025Updated 10 months ago
- Many ASRs under one roof. With Benchmarking... answering the question. What is the best ASR for my dataset?☆19Oct 5, 2022Updated 3 years ago
- Know Your Enemy To Save Cloud Energy: Energy-Performance Characterization of Machine Learning Serving (HPCA '23)☆14Jun 20, 2025Updated 11 months ago
- Fine-tuning llama2 using the chain-of-thought approach☆10Nov 18, 2023Updated 2 years ago
- A RAG system is just the beginning of harnessing the power of LLM. The next step is creating an intelligent Agent. In Agentic RAG the Ag…☆14May 31, 2024Updated 2 years ago
- PyTorch implementation of WaveFit [2022, Google] which is one of SOTA lightweight/fast speech vocoders.☆66Sep 8, 2025Updated 9 months ago
- BATCH: Adaptive Batching for Efficient Machine Learning Serving on Serverless Platforms☆11Aug 7, 2021Updated 4 years ago
- Quick useful examples of data science & ML & big data☆16Jun 12, 2023Updated 2 years ago
- The official implementation of the method discussed in the paper Improving Spoken Language Identification with Map-Mix (work accepted at I…☆18Feb 17, 2023Updated 3 years ago
- Edutainment game teaching players concepts around machine learning☆15Feb 18, 2020Updated 6 years ago
- No longer updated; for further reading, please go to https://github.com/rgtjf/Paper-Reading-Third-Edition☆11Oct 8, 2017Updated 8 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆12Nov 14, 2025Updated 6 months ago
- Julia implementation of flash-attention operation for neural networks.☆11May 31, 2023Updated 3 years ago
- Adds a Doctrine Id generator which uses an ordered UUID in MySQL for extra performance. Uses methods described in Karhik Appigatla's arti…☆10Jun 8, 2015Updated 11 years ago
- 💾 Optimize Laravel caching with Cachetastic! Cache method results, force refresh, handle errors, and boost app performance effortlessly.☆13Jan 26, 2026Updated 4 months ago
- spotify cli for the official client via dbus☆13May 5, 2020Updated 6 years ago