aimh-lab / visione
An AI-powered interactive video retrieval system
☆15 · Updated last week
Related projects:
- [Thesis'24] Efficient Class Incremental Learning for Object Detection ☆12 · Updated 2 months ago
- AICITY2024 Track 2 - Code from AIO_ISC Team ☆27 · Updated 2 months ago
- Code release for "VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment" [TMLR, 2023] ☆13 · Updated 9 months ago
- Pioneering in Vietnamese Multimodal Large Language Model ☆37 · Updated last month
- SSLCL: An Efficient Model-Agnostic Supervised Contrastive Learning Framework for Emotion Recognition in Conversations ☆9 · Updated last month
- BED-AIO team code for AIChallenge2023 ☆39 · Updated last month
- LIME-SAM aims to create an Explainable Artificial Intelligence (XAI) framework for image classification using LIME (Local Interpretable M… ☆32 · Updated last year
- Archive of Tasks and Results of the Video Browser Showdown ☆11 · Updated last month
- VLSP2021 vieCap4H Challenge: Automatic image caption generation for healthcare domains in Vietnamese ☆10 · Updated last year
- AIO Research Agent - an all-in-one intelligent companion for navigating the academic world. ☆30 · Updated 2 months ago
- A simple PyTorch implementation of the Representation Learning via Invariant Causal Mechanisms self-supervised contrastive learning paper ☆10 · Updated 5 months ago
- Object detection with Satellite Images ☆10 · Updated last year
- Baseline for ZaloAI Challenge 2023 Elementary Math Solving ☆67 · Updated 7 months ago
- ViSoBERT: A Pre-Trained Language Model for Vietnamese Social Media Text Processing (EMNLP'2023) ☆0 · Updated 3 weeks ago
- A collection of Vietnamese women who are currently working in the field of Computer Science. ☆11 · Updated last month
- Information Retrieval from Audio via Knowledge Graph ☆85 · Updated last month
- General template for most Pytorch projects ☆34 · Updated last week
- Multilingual Multitask Multipurpose Medical Speech Recognition ☆79 · Updated last month
- [ICCV 2023] Simple Baselines for Interactive Video Retrieval with Questions and Answers ☆11 · Updated 5 months ago
- [CVPRW 2024] TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning. Official code for the 3rd place solution of t… ☆28 · Updated 3 months ago
- This is the official repository for Vista dataset - A Vietnamese multimodal dataset contains more than 700,000 samples of conversations a… ☆23 · Updated 4 months ago
- Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization (BMVC 2024 Oral ✨) ☆10 · Updated last week
- Top Picks for Data Science Self-Study: From Newbies to Pros! ☆11 · Updated 5 months ago
- [CVPR2024] Learning CNN on ViT: A Hybrid Model to Explicitly Class-specific Boundaries for Domain Adaptation ☆30 · Updated 2 weeks ago