byjlw/video-analyzer

Analyze videos using LLMs, Computer Vision and Automatic Speech Recognition

☆1,492

Alternatives and similar repositories for video-analyzer

Users who are interested in video-analyzer are comparing it to the libraries listed below.

linyqh / NarratoAI
Using AI models to automatically provide commentary and edit videos with a single click.
☆10,148 · Updated this week
modelscope / FunClip
FunASR-powered video transcription, subtitle generation, and LLM-assisted clipping tool with a local Gradio UI.
☆5,901 · Updated this week
jixiaozhong / Sonic
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
☆3,264 · Jan 8, 2026 · Updated 6 months ago
bytedance / LatentSync
Taming Stable Diffusion for Lip Sync!
☆5,848 · Jun 20, 2025 · Updated last year
Huanshere / VideoLingo
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team …
☆17,623 · Jul 2, 2026 · Updated last week
jbilcke-hf / clapper
Clapper.app, a video synthesizer and sequencer designed for the age of AI cinema
☆2,327 · Aug 1, 2025 · Updated 11 months ago
bytedance / Valley
Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, video, and audio data.
☆285 · May 8, 2026 · Updated 2 months ago
harry0703 / AudioNotes
Quickly extract audio and video content and organize it into structured Markdown notes.
☆2,175 · Jul 26, 2024 · Updated last year
lihxxx / DisPose
[ICLR2025] DisPose: Disentangling Pose Guidance for Controllable Human Image Animation
☆379 · Nov 20, 2025 · Updated 7 months ago
FunAudioLLM / CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
☆21,992 · May 25, 2026 · Updated last month
modelscope / FunASR
Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-…
☆18,996 · Updated this week
VectorSpaceLab / Video-XL
🔥🔥First-ever hour scale video understanding models
☆626 · Jul 14, 2025 · Updated 11 months ago
whotto / Video_note_generator
Convert videos into high-quality Xiaohongshu notes with one click, automatically optimizing content and images.
☆1,789 · Oct 30, 2025 · Updated 8 months ago
iamarunbrahma / vision-parse
Parse PDFs into markdown using Vision LLMs
☆480 · Oct 4, 2025 · Updated 9 months ago
parsakhaz / video-understanding-engine
A powerful video summarization tool that utilizes Moondream alongside multiple AI models to provide comprehensive video understanding thr…
☆28 · Jan 28, 2025 · Updated last year
Mustafa-Esoofally / podcast-engine-groq
☆168 · Oct 31, 2024 · Updated last year
InternLM / MindSearch
🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
☆6,886 · Jul 4, 2025 · Updated last year
harry0703 / MoneyPrinterTurbo
Generate HD short videos with one click using large AI models.
☆95,700 · Jul 3, 2026 · Updated last week
SamurAIGPT / AI-Youtube-Shorts-Generator
Open-source alternative to Opus Clip, Vidyo.ai, Klap & SubMagic. Turn long-form YouTube videos into viral 9:16 shorts using LLM highlight…
☆4,170 · Jun 22, 2026 · Updated 2 weeks ago
lipku / LiveTalking
Real time interactive streaming digital human
☆8,165 · Updated this week
modstart-lib / aigcpanel
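AIGCPanel is a simple, easy-to-use, all-in-one AI digital human system that supports video synthesis, voice synthesis, and voice cloning, and simplifies local model management with one-click import and use of AI models.
☆5,236 · Jun 28, 2026 · Updated last week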
HKUDS / VideoRAG
[KDD'2026] "VideoRAG: Chat with Your Videos"
☆3,094 · Mar 18, 2026 · Updated 3 months ago
FunAudioLLM / SenseVoice
Multilingual speech understanding: ASR + emotion recognition + audio event detection. 50+ languages, 15x faster than Whisper, non-autoreg…
☆8,786 · Jun 29, 2026 · Updated last week
SkyworkAI / SkyReels-V1
SkyReels V1: The first and most advanced open-source human-centric video foundation model
☆2,691 · Mar 10, 2025 · Updated last year
modelscope / ClearerVoice-Studio
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Spe…
☆4,289 · Aug 14, 2025 · Updated 10 months ago
zhulu111 / ComfyUI_Bxb
SD变现宝: convert ComfyUI workflows into mini programs with one click.
☆1,510 · Jan 6, 2026 · Updated 6 months ago
fishaudio / fish-speech
SOTA Open Source TTS
☆31,116 · Jun 9, 2026 · Updated last month
HumanAIGC-Engineering / OpenAvatarChat
☆3,594 · Jun 9, 2026 · Updated last month
labring / FastGPT
FastGPT is a knowledge-based platform built on LLMs, offering a comprehensive suite of out-of-the-box capabilities such as data process…
☆28,851 · Updated this week
DAMO-NLP-SG / VideoLLaMA3
Frontier Multimodal Foundation Models for Image and Video Understanding
☆1,171 · Aug 14, 2025 · Updated 10 months ago
Tencent / MimicMotion
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
☆2,622 · Nov 18, 2025 · Updated 7 months ago
hkchengrex / MMAudio
[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
☆2,231 · Feb 23, 2026 · Updated 4 months ago
UCSC-VLAA / story-iter
[ICLR 2026] A Training-free Iterative Framework for Long Story Visualization
☆959 · Apr 2, 2026 · Updated 3 months ago
2noise / ChatTTS
A generative speech model for daily dialogue.
☆39,574 · Apr 10, 2026 · Updated 2 months ago
zai-org / CogAgent
An open-sourced end-to-end VLM-based GUI Agent
☆1,186 · Apr 4, 2025 · Updated last year
browser-use / browser-use
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
☆103,453 · Updated this week
TencentARC / BrushEdit
[under review] The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"
☆587 · Sep 3, 2025 · Updated 10 months ago
SWivid / F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
☆14,904 · Updated this week
TMElyralab / MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
☆6,126 · Sep 26, 2025 · Updated 9 months ago