aiola-lab / whisper-ner

Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition"

☆195

Alternatives and similar repositories for whisper-ner

Users who are interested in whisper-ner are comparing it to the libraries listed below.

tincans-ai / gazelle
Joint speech-language model - respond directly to audio!
☆370 Updated last year
fiddlecube / compliant-llm
Build Secure and Compliant AI agents and MCP Servers. YC W23
☆147 Updated 2 months ago
Vaibhavs10 / optimise-my-whisper
☆205 Updated last year
aiola-lab / whisper-medusa
Whisper with Medusa heads
☆850 Updated this week
umuthopeyildirim / DOOM-Mistral
Mistral7B playing DOOM
☆133 Updated last year
KhoomeiK / interrupting-cow
🐮📢 The first AI voice assistant that interrupts *you*
☆149 Updated 11 months ago
luweigen / whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
☆120 Updated last year
Picovoice / orca
On-device streaming text-to-speech engine powered by deep learning
☆102 Updated 2 weeks ago
arc53 / llm-price-compass
This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient …
☆221 Updated 7 months ago
JigsawStack / insanely-fast-whisper-api
An API to transcribe audio with OpenAI's Whisper Large v3!
☆296 Updated 8 months ago
MinishLab / vicinity
Lightweight Nearest Neighbors with Flexible Backends
☆296 Updated 3 weeks ago
AugmendTech / treeseg
Hierarchical topic segmentation of meeting transcripts using embeddings and divisive clustering.
☆53 Updated last year
agentsea / r1-computer-use
Applying the ideas of Deepseek R1 to computer use
☆216 Updated 6 months ago
fzliu / radient
Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more.
☆279 Updated last week
hedrergudene / asr-sd-pipeline
Speech recognition & diarisation solution with text alignment, deployed in AML pipelines
☆96 Updated last year
babycommando / neuralgraffiti
Live-bending a foundation model’s output at neural network level.
☆266 Updated 4 months ago
freddyaboulton / orpheus-cpp
Fast Streaming TTS with Orpheus + WebRTC (with FastRTC)
☆302 Updated 3 months ago
trzy / llava-cpp-server
LLaVA server (llama.cpp).
☆181 Updated last year
revdotcom / reverb-self-hosted
This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution.
☆53 Updated 7 months ago
sheet0 / npi
Action library for AI Agent
☆222 Updated 4 months ago
AlexBodner / How_Much_VRAM
☆102 Updated 11 months ago
bennyschmidt / ragdoll
The library for character-driven AI experiences.
☆88 Updated last year
lechmazur / confabulations
Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers.
☆197 Updated this week
valine / NeuralFlow
Visualize the intermediate output of Mistral 7B
☆367 Updated 6 months ago
Vaibhavs10 / translate-with-whisper
☆158 Updated 2 years ago
llmonpy / needle-in-a-needlestack
☆116 Updated 6 months ago
plaggy / fast-whisper-server
ASR + diarization model server with speculative decoding
☆62 Updated last year
bennyschmidt / ragdoll-studio
The creative suite for character-driven AI experiences.
☆186 Updated 11 months ago
IST-DASLab / PanzaMail
☆292 Updated 4 months ago
ritabratamaiti / AnyModal
AnyModal is a Flexible Multimodal Language Model Framework for PyTorch
☆101 Updated 7 months ago