Lightweight continuous batching with OpenAI compatibility using HuggingFace Transformers, including T5 and Whisper.
☆29Mar 15, 2025Updated last year
Alternatives and similar repositories for transformers-continuous-batching
Users that are interested in transformers-continuous-batching are comparing it to the libraries listed below.
- ☆11Feb 20, 2025Updated last year
- Welcome to the RAG University repository! This repository contains code implementations for Retrieval-Augmented Generation (RAG) models, …☆23Dec 18, 2023Updated 2 years ago
- CompChomper is a framework for measuring how LLMs perform at code completion.☆21Apr 29, 2025Updated last year
- A fully autonomous agent that accesses the browser and performs tasks.☆18Apr 25, 2025Updated last year
- an auto-sleeping and -waking framework around llama.cpp☆13Feb 8, 2025Updated last year
- An Open WebUI function for a better R1 experience☆77Mar 7, 2025Updated last year
- Run Orpheus 3B Locally with Gradio UI, Standalone App☆24Apr 1, 2025Updated last year
- SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversati…☆29May 6, 2025Updated last year
- Offline LLM chatbot with personalized memory — works on CPU with multi-session memory support.☆22Jan 10, 2026Updated 5 months ago
- Pre-training Language Models for Japanese☆50Jul 2, 2023Updated 2 years ago
- Utility to use ElevenLabs streaming in the command line☆11Aug 8, 2023Updated 2 years ago
- The official repo of the paper "StressTest: Can YOUR Speech LM Handle the Stress?"☆20Jul 9, 2025Updated 11 months ago
- Modified Beam Search with periodical restart☆12Sep 12, 2024Updated last year
- ☆24Jan 22, 2025Updated last year
- implementation of https://arxiv.org/pdf/2312.09299☆21Jul 3, 2024Updated last year
- ☆16Dec 16, 2024Updated last year
- Demo app with Loguru logging, async middleware to generate X-request-Id. Works with Gunicorn or Uvicorn, and is safe to use with async/th…☆10Feb 2, 2022Updated 4 years ago
- The Python Implementation of CRISP: Clustering Multi-Vector Representations for Denoising and Pruning☆27Jul 27, 2025Updated 10 months ago
- An extension of MCP for SillyTavern.☆86Updated this week
- A simple node.js MRCP (v.2) library☆11Oct 26, 2024Updated last year
- ☆18Sep 4, 2024Updated last year
- Running Microsoft's BitNet inference framework via FastAPI, Uvicorn and Docker.☆38Jul 2, 2025Updated 11 months ago
- Yet another frontend for LLM, written using .NET and WinUI 3☆11Sep 14, 2025Updated 9 months ago
- ☆40Mar 25, 2023Updated 3 years ago
- Multi-turn dataset management tool for LLM trainers☆13Mar 31, 2025Updated last year
- ☆195Dec 18, 2025Updated 5 months ago
- Speaker change detection using SincNet and an LSTM/Transformer☆58May 26, 2025Updated last year
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others.☆36May 11, 2026Updated last month
- Protocol for Augmented Memory of Project Artifacts (MCP compatible) - extended☆25Jan 24, 2026Updated 4 months ago
- Simple node proxy for llama-server that enables MCP use☆19May 10, 2025Updated last year
- An OpenAI API compatible FastAPI server that sits on top of the Anemll repo. Tested with Open WebUI.☆21Jan 21, 2026Updated 4 months ago
- [ALPHA] Persist and recall information across any AI tooling, powered by SQLite + MCP + Local Embeddings☆20Jun 12, 2025Updated last year
- A PyTorch implementation of a text-to-video GAN☆12Jul 26, 2019Updated 6 years ago
- input aspect ratio, output dimensions☆21Mar 13, 2026Updated 3 months ago
- A forward proxy to turn network traffic into personal memory for AI agents☆38Mar 30, 2026Updated 2 months ago
- Pybind11 bindings for Kaldi☆15Feb 1, 2026Updated 4 months ago
- A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" that provides an OpenAI-compatibl…☆14Feb 7, 2025Updated last year
- Digital Signal Processing for Big EEGs☆13Feb 9, 2026Updated 4 months ago
- Run Orpheus 3B Locally With LM Studio☆32Mar 20, 2025Updated last year