kaylode / vqa-transformer
Visual Question Answering using Transformer and Bottom-Up attention. Implemented in PyTorch.
☆10Updated 3 years ago
Related projects
Alternatives and complementary repositories for vqa-transformer
- Image captioning with Transformer☆15Updated 3 years ago
- General template for most PyTorch projects☆34Updated 2 months ago
- General template for my PyTorch projects.☆18Updated 2 years ago
- A strong baseline for liveness detection. The source code could be used for similar tasks, such as face anti-spoofing or detecting fake v…☆20Updated last year
- Easy-to-read implementation of self-supervised learning using vision transformer and knowledge distillation with no labels - DINO☆22Updated last year
- Vietnamese handwritten text recognition system☆17Updated 3 years ago
- A project for the Zalo AI Challenge 2019, Vietnamese Wikipedia Question Answering task.☆16Updated 4 years ago
- A toolbox for Vietnamese Optical Character Recognition.☆105Updated 2 years ago
- GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)☆185Updated last year
- 👨🏻‍💻 Code release for Vietnamese chatbot from scratch [Published in IEEE IMCOM 2022]☆17Updated 3 months ago
- Repository for Multilingual-VQA task created during HuggingFace JAX/Flax community week.☆34Updated 3 years ago
- 1st place solution for Zalo AI Challenge 2022☆27Updated last year
- Top 1 Quy Nhon AI Hackathon 2022 Challenge Smart Menu☆31Updated 2 years ago
- Get familiar with PyTorch☆8Updated 3 years ago
- VLSP2021 vieCap4H Challenge: Automatic image caption generation for healthcare domains in Vietnamese☆11Updated last year
- This repository contains the official implementation (PyTorch) of "Multimodal Forgery Detection Using Ensemble Learning" proposed in APSI…☆9Updated last year
- Official implementation of POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution Samples (NeurIPS 2021)☆15Updated 2 years ago
- Text Query based Traffic Video Event Retrieval with Global-Local Fusion Embedding☆12Updated last year
- Zenith training source for Thach Thuc competition that is hosted by HCMUS☆18Updated 3 years ago
- This is the official repository for the Vista dataset - A Vietnamese multimodal dataset containing more than 700,000 samples of conversations a…☆24Updated 6 months ago
- Top 2 Solution for Zalo AI Challenge 2022 - Liveness Detection track☆44Updated last year
- Archive of Tasks and Results of the Video Browser Showdown☆11Updated 3 months ago
- Dictionary-guided Scene Text Recognition (CVPR-2021)☆141Updated 3 months ago
- Runner-up team (2nd place) in AI4VN2022: Air Quality Forecasting Challenge☆32Updated last year
- [Thesis'24] Efficient Class Incremental Learning for Object Detection☆14Updated 4 months ago
- Zalo AI Challenge 2020: News Summarization - Runner-up solution☆20Updated 3 years ago