Analyze videos using LLMs, Computer Vision and Automatic Speech Recognition
☆1,453Apr 19, 2026Updated last month
Alternatives and similar repositories for video-analyzer
Users that are interested in video-analyzer are comparing it to the libraries listed below.
- Using large AI models to automatically provide commentary and edit videos with a single click.☆9,813Jun 10, 2026Updated last week
- Open-source, accurate and easy-to-use video speech recognition & clipping tool. LLM-based AI clipping integrated.☆5,826Jun 12, 2026Updated last week
- Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"☆3,255Jan 8, 2026Updated 5 months ago
- Taming Stable Diffusion for Lip Sync!☆5,773Jun 20, 2025Updated 11 months ago
- Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team☆17,447Updated this week
- Clapper.app, a video synthesizer and sequencer designed for the age of AI cinema☆2,326Aug 1, 2025Updated 10 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆283May 8, 2026Updated last month
- Quickly extract audio and video content and organize it into structured Markdown notes☆2,147Jul 26, 2024Updated last year
- [ICLR2025] DisPose: Disentangling Pose Guidance for Controllable Human Image Animation☆379Nov 20, 2025Updated 6 months ago
- Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.☆21,667May 25, 2026Updated 3 weeks ago
- Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-…☆18,067Updated this week
- Convert videos into high-quality Xiaohongshu notes with one click, automatically optimizing content and images☆1,780Oct 30, 2025Updated 7 months ago
- 🔥🔥 First-ever hour-scale video understanding models☆624Jul 14, 2025Updated 11 months ago
- Parse PDFs into markdown using Vision LLMs☆478Oct 4, 2025Updated 8 months ago
- ☆169Oct 31, 2024Updated last year
- 🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)☆6,873Jul 4, 2025Updated 11 months ago
- Generate high-definition short videos with one click using large AI models.☆87,414Updated this week
- Open-source alternative to Opus Clip, Vidyo.ai, Klap & SubMagic. Turn long-form YouTube videos into viral 9:16 shorts using LLM highlight…☆3,911Updated this week
- Real time interactive streaming digital human☆7,998Jun 11, 2026Updated last week
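- AIGCPanel is a simple, easy-to-use, all-in-one AI digital human system that supports video synthesis, voice synthesis, and voice cloning, and simplifies local model management with one-click import and use of AI models.☆5,125Jun 10, 2026Updated last week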
- [KDD'2026] "VideoRAG: Chat with Your Videos"☆3,059Mar 18, 2026Updated 3 months ago
- Multilingual speech understanding: ASR + emotion recognition + audio event detection. 50+ languages, 15x faster than Whisper, non-autoreg…☆8,584Jun 9, 2026Updated last week
- SkyReels V1: The first and most advanced open-source human-centric video foundation model☆2,686Mar 10, 2025Updated last year
- An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Spe…☆4,249Aug 14, 2025Updated 10 months ago
- SD变现宝 (SD monetization tool): convert ComfyUI workflows into mini programs with one click.☆1,504Jan 6, 2026Updated 5 months ago
- SOTA Open Source TTS☆30,816Jun 9, 2026Updated last week
- ☆3,539Jun 9, 2026Updated last week
- FastGPT is a knowledge-based platform built on the LLMs, offers a comprehensive suite of out-of-the-box capabilities such as data process…☆28,435Updated this week
- Frontier Multimodal Foundation Models for Image and Video Understanding☆1,165Aug 14, 2025Updated 10 months ago
- High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance☆2,604Nov 18, 2025Updated 7 months ago
- [CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis☆2,206Feb 23, 2026Updated 3 months ago
- [ICLR 2026] A Training-free Iterative Framework for Long Story Visualization☆958Apr 2, 2026Updated 2 months ago
- A generative speech model for daily dialogue.☆39,469Apr 10, 2026Updated 2 months ago
- An open-sourced end-to-end VLM-based GUI Agent☆1,185Apr 4, 2025Updated last year
- 🌐 Make websites accessible for AI agents. Automate tasks online with ease.☆99,362Updated this week
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"☆14,729May 18, 2026Updated last month
- [under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"☆588Sep 3, 2025Updated 9 months ago
- MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting☆5,981Sep 26, 2025Updated 8 months ago
- Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks☆7,611Dec 12, 2025Updated 6 months ago