neilfei / brivl-nmi
☆60Updated 3 years ago
Alternatives and similar repositories for brivl-nmi
Users that are interested in brivl-nmi are comparing it to the libraries listed below
- A curated list of vision-and-language pre-training (VLP). :-)☆59Updated 3 years ago
- A repo for REMOD: relation extraction algorithm based on multimodality knowledge distillation☆28Updated 3 years ago
- UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning☆70Updated 4 years ago
- pytorch implementation of mvp: a multi-stage vision-language pre-training framework☆33Updated 2 years ago
- Official code for ICLR 2022 paper: "PoNet: Pooling Network for Efficient Token Mixing in Long Sequences".☆32Updated 2 years ago
- A simple script for detecting and removing cryptomining malware☆17Updated 3 years ago
- The codes and dataset for the semantic explainable AI (S-XAI)☆15Updated 2 years ago
- Transformer model based on the Gated Attention Unit (preview version)☆98Updated 2 years ago
- Southeast University Knowledge Graph-OpenRichpedia☆38Updated 3 years ago
- Bling's Object detection tool☆56Updated 2 years ago
- A single-model, multi-scale VAE based on Transformer☆57Updated 4 years ago
- A *tuned* minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training☆115Updated 3 years ago
- ☆73Updated 3 years ago
- CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations, ICCV 2021☆64Updated 3 years ago
- [TMLR 2022] High-Modality Multimodal Transformer☆116Updated 8 months ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆74Updated 7 months ago
- Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding”☆30Updated 2 years ago
- Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Tra…☆32Updated 3 years ago
- the world's first large-scale multi-modal short-video encyclopedia, where the primitive units are items, aspects, and short videos.☆62Updated last year
- Source code for EMNLP2022 long paper: Parameter-Efficient Tuning Makes a Good Classification Head☆14Updated 2 years ago
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b…☆76Updated 3 weeks ago
- Pre-trained Language Model for Scientific Text☆45Updated last year
- Policies of scientific publisher and conferences towards large language model (LLM), such as ChatGPT☆74Updated 2 years ago
- ☆22Updated 2 years ago
- A visualizer to display attention weights on text☆23Updated 6 years ago
- TagGPT: Large Language Models are Zero-shot Multimodal Taggers☆63Updated 2 years ago
- Official code for the NAACL 2022 paper "Fuse It More Deeply! A Variational Transformer with Layer-Wise Latent Variable Inference for Text…☆35Updated 2 years ago
- Implementation of the Benchmark Approaches for Medical Instructional Video Classification (MedVidCL) and Medical Video Question Answering…☆27Updated 2 years ago
- SimVLM ---SIMPLE VISUAL LANGUAGE MODEL PRETRAINING WITH WEAK SUPERVISION☆36Updated 2 years ago
- custom pytorch implementation of MoCo v3☆46Updated 4 years ago