forbes110 / PLEDGE--Paragraph-LEvel-image-Description-GEneration
Applies an end-to-end model (ViT + GPT) to describe images in paragraph-level detail, in contrast to traditional image captioning, which provides only object detections or a few simple sentences.
☆11Updated last year
Alternatives and similar repositories for PLEDGE--Paragraph-LEvel-image-Description-GEneration
Users that are interested in PLEDGE--Paragraph-LEvel-image-Description-GEneration are comparing it to the libraries listed below
- This project predicts wind turbine failure from sensor data by applying classification-based ML models that improve prediction…☆10Updated 2 years ago
- TrustAi website☆12Updated last year
- Simple, Unified Repository for Retrieval-based Voice Conversion☆17Updated last year
- Deep metric learning: Triplet, Magnet and VMF loss☆11Updated 3 years ago
- Optimal Planning for NTU YouBike Assignment with Operation Research and Machine Learning Techniques☆10Updated last year
- A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features☆10Updated 3 years ago
- ☆14Updated 2 years ago
- A curated list of resources in audio-visual question answering and related areas. :-)☆17Updated 7 months ago
- PyTorch implementation of the Explainable Conditional Adversarial Autoencoder using Saliency Maps and SHAP (J. of Imaging - MDPI)☆12Updated 10 months ago
- Apply pre-trained models to help quickly grasp investment news, including three tasks: 1. summarization, 2. sentiment analysis, 3. domain …☆13Updated last year
- Cantonese Selfish Project 廣東話自肥企劃 at PYCON HK 2021☆15Updated 3 years ago
- I created some notebooks about different concepts of financial engineering☆10Updated 4 months ago
- Scripts, data, and research related to cow weight and breed prediction☆13Updated 5 months ago
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a…☆12Updated 2 years ago
- A Python neural network made with TensorFlow that converts one person's voice into another.☆10Updated 5 years ago
- The official code of "Parameter-Efficient Learning for Text-to-Speech Accent Adaptation"