scliubit / PPT2Video
Generate video with voice narration from PPT/PDF slides
☆10Updated last year
Alternatives and similar repositories for PPT2Video
Users that are interested in PPT2Video are comparing it to the libraries listed below
- Simple, Unified Repository for Retrieval-based Voice Conversion☆17Updated last year
- Code for paper: "Privately generating tabular data using language models".☆15Updated 2 years ago
- Deep metric learning: Triplet, Magnet and VMF loss☆11Updated 3 years ago
- ☆10Updated last year
- A curated list of resources in audio visual question answering and related area. :-)☆12Updated last month
- This is the code for the "Robust Gait Recognition based on Deep CNNs with Camera and Radar Sensor Fusion".☆13Updated 2 years ago
- Taking advantage of LlamaIndex's in-context learning paradigm, LlamaDoc empowers users to input PDF documents and pose any questions rela…☆14Updated 2 years ago
- Indic-Conformer models for ASR☆18Updated last year
- Apply an end-to-end model structure (ViT + GPT) to describe images in more detail, rather than traditional image captioning that only pro…☆11Updated 7 months ago
- Implementation of Baseline for Scene Text-to-Scene Text Translation☆17Updated 4 months ago
- A desktop compatible version of the Defog app☆14Updated last year
- ☆16Updated last year
- A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features☆10Updated 2 years ago
- In this project, I used Decision Tree Learning Model as the main algorithm to build the model. Due to the big amount of flight data, we i…☆12Updated 3 years ago
- Dataset accompanying the paper titled "Pothole detection and dimension estimation system using deep learning (YOLO) and image processing"☆11Updated 2 years ago
- Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation☆16Updated 2 years ago
- WebRTC-based real-time audio streaming with Faster Whisper ASR integration for live speech-to-text transcription.☆12Updated 11 months ago
- ☆11Updated 4 years ago
- Rate-Adaptive Quantization: A Multi-Rate Codebook Adaptation for Vector Quantization-based Generative Models☆13Updated 6 months ago
- A composition of offline tools to achieve high quality multilingual speech to text transcription☆19Updated 2 months ago
- Enabling the use of multiple modalities while prompting Stable Diffusion☆15Updated 2 years ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax☆14Updated last year
- ☆18Updated 2 years ago
- A curated list of recent efficient video generation methods.☆21Updated this week
- A complete end-to-end Deep Learning system to generate high quality human like speech in English for Korean Drama (WIP)☆13Updated 2 years ago
- We archive data because we are interested in the diffs. All data is from https://video-api.cartoonnetwork.com. We run the check every min…☆10Updated this week
- DoyenTalker uses deep learning techniques to generate personalized avatar videos that speak user-provided text in a specified voice. The …☆12Updated 11 months ago
- This is the official implementation for the paper: Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models☆15Updated 11 months ago
- This is a repository for the course "From Beginner to LLM Developer" by Towards AI.☆11Updated 7 months ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning☆31Updated last year