ElmiraGhorbani / gpt-speaker-diarization
Conversational speaker diarization using OpenAI language models (GPT-4) and OpenAI Whisper.
☆14Updated 2 years ago
Alternatives and similar repositories for gpt-speaker-diarization
Users who are interested in gpt-speaker-diarization are comparing it to the libraries listed below.
- 🤖 Quantum-powered excuse generator for developers. Blame bugs on cosmic rays, AI sentience, or Schrödinger’s intern.☆26Updated 2 months ago
- Retrieval Augmented Generation for youtube videos with a BRAD agent☆33Updated 9 months ago
- AI Lip Syncing application, deployed on Streamlit☆43Updated last year
- This repository features a Gradio interface designed to leverage the OpenAI Text-To-Speech (TTS) API. The interface lets users create spe…☆14Updated 2 years ago
- An open source chatbot architecture for voice/vision (and multimodal) assistants that runs locally (CPU/GPU bound) or remotely (I/O bound).☆84Updated last week
- High-performance ASR tool using Faster Whisper, supporting custom models, multi-language transcription, and real-time processing feedback…☆10Updated last month
- Small demos demonstrating different capabilities of LiveKit Agents☆20Updated 7 months ago
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an…☆16Updated 3 weeks ago
- A proposed GPT chatbot for teachers that uses retrieval-augmentation to answer questions about their students.☆10Updated 11 months ago
- The UnisonAI Multi-Agent Framework built on custom workflow which allows ai agents to talk together and provides a flexible and extensibl…☆22Updated 2 weeks ago
- A python library to find differences between audio and transcriptions☆19Updated last year
- A chatbot UI for RAG, multimodal, text completion. (support Transformers, llama.cpp, MLX, vLLM)☆19Updated last year
- ASR + diarization model server with speculative decoding☆63Updated last year
- ☆15Updated 5 months ago
- Multimodal Open Source Framework for Conversational Agent Research and Development.☆21Updated 8 months ago
- Advanced Coding AI Assistant that uses a Gradio interface to stream coding related responses. ChatRAG supports local and API inference an…☆22Updated 6 months ago
- A WebRTC server that allows you to interact with an LLM using your speech and responds back with generated audio.☆139Updated last year
- An audio processing tool for detecting and removing silence in audio recordings. Create text files for video silence removal using custom…☆25Updated 5 months ago
- next level Autogen with teams, tools and training to reach the goal. -Deprecated-☆80Updated last year
- Get started using Deepgram's Live Transcription with this Flask demo app☆40Updated 2 weeks ago
- A library for real-time Speech to Text (STT), and Text to Speech (TTS) capability☆44Updated last year
- PlayHT Python SDK - AI Text-to-Speech Streaming & Voice Cloning API☆217Updated 2 months ago
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale …☆19Updated 3 weeks ago
- BanterBot: An OpenAI ChatGPT-powered chatbot with Azure Neural Voices. Supports multilingual speech-to-text and text-to-speech interactio…☆11Updated 4 months ago
- A high-performance, distributed memory management system for LLM agents built with LangGraph, LangChain, Ray, and vLLM. Features multi-la…☆11Updated 6 months ago
- python skills for autogen☆31Updated last year
- AI Talking Head: create video from plain text or audio file in minutes, support up to 100+ languages and 350+ voice models.☆36Updated 2 years ago
- VideoDB Python SDK☆84Updated last month
- Agent Studio is an AI agent application designed to handle real-time interactions through phone calls, web-based voice user interfaces (V…☆45Updated last year
- A Full-Duplex Open-Domain Dialogue Agent with Continuous Turn-Taking Behavior☆34Updated 2 years ago