[ICIP 2022 oral] VLCap: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning
☆28 · Jun 28, 2023 · Updated 2 years ago
Alternatives and similar repositories for VLCAP
Users that are interested in VLCAP are comparing it to the libraries listed below.
- ☆20 · Nov 11, 2024 · Updated last year
- [AAAI 2023 Oral] VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning ☆68 · Feb 16, 2024 · Updated 2 years ago
- [Lab] lab website ☆11 · Mar 23, 2026 · Updated 3 weeks ago
- [IJCV] AOE-Net: Entities Interactions Modeling with Adaptive Attention Mechanism for Temporal Action Proposals Generation ☆20 · Jul 2, 2024 · Updated last year
- [ICRA 2024 Oral] Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation ☆147 · Aug 19, 2024 · Updated last year
- [ICPR 2022] 3DConvCaps: 3DUnet with Convolutional Capsule Encoder for Medical Image Segmentation ☆47 · Jun 26, 2022 · Updated 3 years ago
- [ISBI 2024] An implementation of TSRNet for ECG Anomaly Detection ☆24 · Apr 11, 2024 · Updated 2 years ago
- [BMVC 2022] AISFormer: Amodal Instance Segmentation with Transformer ☆44 · Nov 24, 2024 · Updated last year
- The repository of the ACCV 2024 paper "FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Ge… ☆11 · Jul 28, 2025 · Updated 8 months ago
- [NeurIPS 2023] Official Implementation of "PaintSeg: Painting Pixels for Training-free Segmentation" ☆14 · Dec 31, 2023 · Updated 2 years ago
- [ISBI 2024] An implementation of SAM3D which adapts Segment Anything Model for Volumetric Medical Image Segmentation ☆83 · May 28, 2024 · Updated last year
- Video Feature Extractor for S3D-HowTo100M ☆29 · Apr 30, 2021 · Updated 4 years ago
- ☆16 · Dec 4, 2025 · Updated 4 months ago
- [Official Implementation] Acoustic Autoregressive Modeling 🔥 ☆75 · Aug 24, 2024 · Updated last year
- Explaining audio differences using language ☆16 · Feb 11, 2025 · Updated last year
- ☆18 · Nov 23, 2022 · Updated 3 years ago
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition" ☆12 · Feb 27, 2024 · Updated 2 years ago
- Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems ☆13 · Jan 16, 2025 · Updated last year
- Copy-paste augmentation in detectron2 pipeline ☆35 · Mar 25, 2021 · Updated 5 years ago
- Qualia is a deep learning framework deeply integrated with automatic differentiation and dynamic graphing with CUDA acceleration. Qualia … ☆47 · Apr 6, 2021 · Updated 5 years ago
- Generative Models for Image Captioning ☆10 · Jun 7, 2017 · Updated 8 years ago
- Adapted from the widely used project webpage template made by the colorful folks. ☆42 · Aug 8, 2021 · Updated 4 years ago
- Re-thinking Co-Salient Object Detection, TPAMI 2021 ☆24 · Jan 26, 2023 · Updated 3 years ago
- ☆10 · Oct 7, 2023 · Updated 2 years ago
- Detectron2 is a platform for object detection, segmentation and other visual recognition tasks. ☆19 · May 7, 2022 · Updated 3 years ago
- [TMM 2021] PiSLTRc: Position-informed Sign Language Transformer with Content-aware Convolution ☆11 · Dec 9, 2021 · Updated 4 years ago
- ☆13 · Jun 26, 2022 · Updated 3 years ago
- Code for "Time-Aware Auto White Balance in Mobile Photography" ☆28 · Jan 25, 2026 · Updated 2 months ago
- This is tensorflow 2.2 based SCAMET framework for remote sensing image captioning. ☆13 · Aug 10, 2023 · Updated 2 years ago
- ☆15 · Nov 19, 2020 · Updated 5 years ago
- EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important tempo… ☆23 · Mar 8, 2024 · Updated 2 years ago
- Code and data for experiments on semantic fragments ☆11 · Jun 23, 2022 · Updated 3 years ago
- A simple and effective feature extractor for untrimmed videos ☆13 · Sep 1, 2022 · Updated 3 years ago
- Video-Text Representation Learning via Differentiable Weak Temporal Alignment (PyTorch implementation for the CVPR 2022 paper) ☆11 · Oct 12, 2022 · Updated 3 years ago
- [ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning ☆171 · Dec 4, 2020 · Updated 5 years ago
- ☆19 · Dec 8, 2024 · Updated last year
- [AAAI 2025] Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving ☆52 · Mar 4, 2026 · Updated last month
- ☆12 · May 26, 2023 · Updated 2 years ago
- ☆11 · Sep 15, 2023 · Updated 2 years ago