Dynamic batching library for Deep Learning inference. Tutorials for LLM, GPT scenarios.
☆106Aug 14, 2024Updated last year
Alternatives and similar repositories for batch-inference
Users that are interested in batch-inference are comparing it to the libraries listed below.
- Python Inference Script(PyIS)☆19Aug 30, 2022Updated 3 years ago
- ☆11Jul 3, 2023Updated 2 years ago
- LLM Serving Performance Evaluation Harness☆85Feb 25, 2025Updated last year
- 🗣️ Convert between phonetic alphabets☆11Feb 7, 2022Updated 4 years ago
- Simple and easy stable diffusion inference with LightningModule on GPU, CPU and MPS (Possibly all devices supported by Lightning).☆17Jul 27, 2023Updated 2 years ago
- A time delay estimation method for event-based time-series data. Time delay estimation is also known as the correction of time offsets an…☆15Dec 3, 2025Updated 5 months ago
- golang vad (voice activity detection) library based on webrtc☆12Dec 13, 2021Updated 4 years ago
- KenLM extension for spaCy 2.0.☆16Dec 6, 2017Updated 8 years ago
- The helm chart for setting up SearXNG with kubernetes.☆45Mar 10, 2025Updated last year
- Testing various methods of moving Arrow data between processes☆17Mar 29, 2023Updated 3 years ago
- Code for the paper "RIR-in-a-Box : Estimating Room Acoustics from 3D Mesh Data through Shoebox Approximation" presented at Interspeech 20…☆16Sep 1, 2024Updated last year
- ☆22Dec 3, 2021Updated 4 years ago
- superfast text to speech in any voice☆62Feb 16, 2026Updated 3 months ago
- Entity Linking within a Social Media Platform☆11May 2, 2019Updated 7 years ago
- Summary of system papers/frameworks/codes/tools on training or serving large model☆57Dec 17, 2023Updated 2 years ago
- High-performance vector search engine with no loss of accuracy through GPU and dynamic placement☆32Jul 12, 2025Updated 10 months ago
- Many ASRs under one roof. With Benchmarking... answering the question. What is the best ASR for my dataset?☆19Oct 5, 2022Updated 3 years ago
- Fine-tuning llama2 using the chain-of-thought approach☆10Nov 18, 2023Updated 2 years ago
- A RAG system is just the beginning of harnessing the power of LLM. The next step is creating an intelligent Agent. In Agentic RAG the Ag…☆14May 31, 2024Updated last year
- experiments with inference on llama☆103Jun 6, 2024Updated last year
- speaker-disentangled speech linguistic content quantizer☆25Mar 19, 2025Updated last year
- The official implementation of the method discussed in the paper Improving Spoken Language Identification with Map-Mix(work accepted at I…☆18Feb 17, 2023Updated 3 years ago
- Julia implementation of flash-attention operation for neural networks.☆11May 31, 2023Updated 2 years ago
- ☆17Nov 9, 2024Updated last year
- Seldon Core Operator for Kubernetes☆13Nov 5, 2019Updated 6 years ago
- A STFT/iSTFT written up in PyTorch using 1D Convolutions☆32Jul 9, 2024Updated last year
- 🐆A lightweight, high-performance string manipulation library optimized for speed-sensitive applications.☆16Mar 28, 2026Updated last month
- An ultimate pdf file disintegration tool☆11Jun 12, 2020Updated 5 years ago
- From the paper Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition☆29Nov 20, 2024Updated last year
- High performance async Mssql library for Python.☆22Updated this week
- ☆31Sep 3, 2023Updated 2 years ago
- Google Play InApp Billing v3 Example☆11Mar 21, 2021Updated 5 years ago
- TensorRT☆11Sep 22, 2020Updated 5 years ago
- Go port of the metaphone3 algorithm☆25Sep 3, 2019Updated 6 years ago
- Born to flag outliers☆30May 8, 2026Updated last week
- Streaming JSON parser designed to process JSON data incrementally. The primary goal is to handle potentially incomplete JSON data streams…☆13Apr 5, 2025Updated last year
- Mike/Projects/pysilero-vad.git☆24Feb 12, 2026Updated 3 months ago
- ☆18May 4, 2025Updated last year
- ☆539Apr 1, 2026Updated last month