EricLee8 / MPD_EMVI
Official implementation of our paper at ACL 2023: Pre-training Multi-party Dialogue Models with Latent Discourse Inference
☆10 · Updated last year
Alternatives and similar repositories for MPD_EMVI
Users who are interested in MPD_EMVI are comparing it to the repositories listed below.
- A fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model. ☆14 · Updated last month
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild ☆47 · Updated last year
- Fluency ENhanced Sentence-bert Evaluation (FENSE), a metric for audio caption evaluation, and the benchmark datasets AudioCaps-Eval, Clotho-Eval… ☆21 · Updated 2 years ago
- Official implementation of AAAI'2022 paper "Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement" ☆18 · Updated 3 years ago
- A project for tri-modal LLM benchmarking and instruction tuning. ☆34 · Updated last month
- The dataset and baseline code for Text-to-Audio Grounding (TAG) ☆42 · Updated 4 months ago
- Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation". ☆36 · Updated last year
- Code for ACL 2022 findings paper "Gaussian Multi-head Attention for Simultaneous Machine Translation" ☆11 · Updated 3 years ago
- Code for EMNLP 2022 main conference paper "Low-resource Neural Machine Translation with Cross-modal Alignment". ☆14 · Updated 2 years ago
- [ACM MM 2024] See or Guess: Counterfactually Regularized Image Captioning ☆14 · Updated 3 months ago
- The code and weights for LoVA. LoVA is a novel model for Long-form Video-to-Audio generation. Based on the Diffusion Transformer (DiT) arc… ☆13 · Updated 2 months ago
- Code for Findings of EMNLP 2023 paper "Coarse-to-Fine Dual Encoders are Better Frame Identification Learners" ☆12 · Updated last year
- Code for the paper "Cross-modal Contrastive Learning for Speech Translation" (NAACL 2022) ☆64 · Updated 2 years ago
- Official implementation of our IWSLT 2023 paper "The MineTrans Systems for IWSLT 2023 Offline Speech Translation and Speech-to-Speech Tra… ☆15 · Updated last year
- [ACM MM 2022]: Multi-Modal Experience Inspired AI Creation ☆20 · Updated 5 months ago
- Narrative movie understanding benchmark ☆70 · Updated last year
- OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Rea… ☆49 · Updated this week
- Data for evaluating GPT-4V ☆11 · Updated last year
- Code for ACL 2022 main conference paper "Neural Machine Translation with Phrase-Level Universal Visual Representations". ☆21 · Updated last year
- [EMNLP 2022] Revisiting Grammatical Error Correction Evaluation and Beyond ☆19 · Updated 2 years ago
- [AAAI 2024] DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning ☆16 · Updated last year
- Continual Learning for Task-Oriented Dialogue Systems ☆29 · Updated 3 years ago
- LongMIT: Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets ☆36 · Updated 7 months ago
- [ICLR 2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by … ☆74 · Updated last year