byjlw / video-analyzer
A video analysis tool that combines computer vision, audio transcription, and natural language processing. It extracts key frames from a video, transcribes the audio, and generates a detailed natural-language description of the video's content.
☆617Updated this week
Alternatives and similar repositories for video-analyzer:
Users who are interested in video-analyzer are comparing it to the repositories listed below.
- A Training-free Iterative Framework for Long Story Visualization☆819Updated last month
- An open-sourced end-to-end VLM-based GUI Agent☆802Updated 2 weeks ago
- Easegen is an open-source digital human course creation platform offering comprehensive solutions from course production and video manage…☆194Updated 3 weeks ago
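- Real-time voice-interactive digital human supporting an end-to-end voice pipeline (GLM-4-Voice - THG) and a cascaded pipeline (ASR-LLM-TTS-THG). Avatar and voice are customizable without training, voice cloning is supported, and first-packet latency is as low as 3 s. …☆746Updated 3 months ago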
- The fastest digital human algorithm, now on your desktop.☆459Updated 2 months ago
- ☆561Updated 4 months ago
- JoyHallo: Digital human model for Mandarin☆454Updated 3 months ago
- Fuse ChatTTS with OpenVoice, upload a 10-second audio clip, and clone your personalized ChatTTS voice.☆416Updated 4 months ago
- AI ContentCraft is an all-in-one content creation suite that helps creators generate stories, podcast scripts, and multimedia content usi…☆316Updated last month
- talking-face video editing☆266Updated last week
- OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking☆406Updated last week
- CogView4, CogView3-Plus and CogView3 (ECCV 2024)☆806Updated this week
- ☆149Updated last month
- gradio WebUI for AdvancedLivePortrait☆454Updated 2 months ago
- 🍦 Speech-AI-Forge is a project built around TTS generation models, implementing an API server and a Gradio-based WebUI.☆1,093Updated this week
- Repo for NAACL 2025 Paper "Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization"☆255Updated last month
- 🌐 WebWalker: Benchmarking LLMs in Web Traversal☆357Updated last month
- ☆501Updated 3 months ago
- [AAAI 2025] StoryWeaver: A Unified World Model for Knowledge-Enhanced Story Character Customization☆193Updated 3 weeks ago
- ☆410Updated 2 weeks ago
- PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides☆730Updated this week
- An API interface project for CosyVoice☆229Updated last month
- The official implementation of paper "BrushEdit: All-In-One Image Inpainting and Editing"☆526Updated 2 months ago
- ☆338Updated 7 months ago