forbes110 / PLEDGE--Paragraph-LEvel-image-Description-GEneration
Applies an end-to-end model (ViT + GPT) to generate paragraph-level image descriptions, going beyond traditional image captioning, which yields only object detections or a few simple sentences.
☆11Updated 2 months ago
Alternatives and similar repositories for PLEDGE--Paragraph-LEvel-image-Description-GEneration:
Users who are interested in PLEDGE--Paragraph-LEvel-image-Description-GEneration are comparing it to the repositories listed below:
- TrustAi website☆12Updated 7 months ago
- Optimal Planning for NTU YouBike Assignment with Operation Research and Machine Learning Techniques☆10Updated 7 months ago
- Apply pre-trained models to help quickly grasp investment news, covering three tasks: 1. summarization, 2. sentiment analysis, 3. domain …☆13Updated 7 months ago
- Unofficial implementation of the paper "On Entropy Approximation for Gaussian Mixture Random Vectors" in Python☆25Updated 7 months ago
- LaTeX Workshop (2024 Spring)☆11Updated 5 months ago
- Python 金融市場賺大錢聖經:寫出你的專屬指標 - 進階技術補充☆14Updated 3 years ago
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Updated last year
- Cantonese Selfish Project 廣東話自肥企劃 at PYCON HK 2021☆15Updated 3 years ago
- The official code of "Parameter-Efficient Learning for Text-to-Speech Accent Adaptation"☆13Updated last year
- ☆10Updated last month
- Code for paper "Unsupervised Noise adaptation using Data Simulation"☆12Updated 10 months ago
- Python 金融市場賺大錢聖經:寫出你的專屬指標☆66Updated 3 weeks ago
- This is an example of 50 alphas that can pass the correlation test if they are submitted together.☆33Updated last year
- ☆17Updated 4 years ago
- A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features☆11Updated 2 years ago
- ☆14Updated last year
- Code and Data for M3A: Multimodal Multi-speaker Mergers & Acquisitions at ACL-IJCNLP 2021 (main)☆16Updated 3 years ago
- The MOS system combines components from DNSMOS, NISQA, MOSSSL, and SIGMOS, using the librosa library to process audio waveforms.☆20Updated last year
- Token and sentence level embeddings from FinBERT model (Finance Domain)☆38Updated last year
- 都會阿嬤 Stocker tutorial☆30Updated 2 years ago
- An implementation of the paper titled "Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset" https://…☆13Updated 3 years ago
- Mispronunciation Detection using a pretrained and finetuned wav2vec2 model for phoneme recognition and diagnosis and feedback using large…☆16Updated 10 months ago
- This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages…☆35Updated last month
- This is an example of 50 alphas that can pass the correlation test if they are submitted together.☆25Updated last year
- ☆10Updated 3 years ago
- FinABSA is a T5-Large model trained for Aspect-Based Sentiment Analysis specifically for financial domains.☆30Updated last year
- Deepfake cross-lingual evaluation dataset (DECRO) is constructed to evaluate the influence of language differences on deepfake detection.…☆11Updated last year
- Homework for Prof. Jian-Jiun Ding's "Time-Frequency Analysis and Wavelet Transform" course (TFW)☆9Updated last year
- Neural Networks to predict stock price☆39Updated 4 years ago
- Technical Analysis on Cryptocurrency☆23Updated 11 months ago