Code release for "VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment" [TMLR, 2023]
☆12 · Dec 9, 2023 · Updated 2 years ago
Alternatives and similar repositories for VoLTA
Users who are interested in VoLTA are comparing it to the libraries listed below.
- Code release for "EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone" [ICCV, 2023] ☆110 · Jul 2, 2024 · Updated 2 years ago
- ☆24 · Jun 12, 2024 · Updated 2 years ago
- Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024] ☆75 · Jan 13, 2025 · Updated last year
- Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models' ☆17 · Mar 14, 2022 · Updated 4 years ago
- This repository contains all the code that we have developed ☆12 · Sep 13, 2023 · Updated 2 years ago
- [CVPR 2025] PyTorch implementation of T-CORE, introduced in "When the Future Becomes the Past: Taming Temporal Correspondence for Self-su… ☆19 · Nov 4, 2025 · Updated 7 months ago
- A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating… ☆137 · Mar 20, 2024 · Updated 2 years ago
- ☆35 · Jul 9, 2025 · Updated 11 months ago
- [ECCV'2024] HERGen: Elevating Radiology Report Generation with Longitudinal Data ☆31 · Jan 25, 2026 · Updated 5 months ago
- [ICML 2024] Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models ☆19 · Mar 23, 2026 · Updated 3 months ago
- A Python implementation of Certifiable Robust Multi-modal Training ☆20 · Jun 21, 2025 · Updated last year
- [ICRA 2025] A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping ☆12 · Feb 7, 2025 · Updated last year
- ☆10 · Oct 27, 2020 · Updated 5 years ago
- A framework for Longitudinal Radiology Report Generation ☆32 · Aug 10, 2024 · Updated last year
- ☆56 · Nov 1, 2024 · Updated last year
- Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition ☆14 · Dec 22, 2022 · Updated 3 years ago
- Unsupervised Domain Adaptation of MRI Skull-stripping Trained on Adult Data to Newborns ☆11 · Jan 26, 2026 · Updated 5 months ago
- Official PyTorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023) ☆55 · Sep 7, 2023 · Updated 2 years ago
- (NeurIPS2023) CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection ☆123 · Apr 26, 2024 · Updated 2 years ago
- The public repo for stats205 scribe notes at Stanford University ☆14 · Jun 10, 2021 · Updated 5 years ago
- Symile is a flexible, architecture-agnostic contrastive loss that enables training modality-specific representations for any number of mo… ☆53 · Mar 25, 2025 · Updated last year
- Code and models for MICCAI23 paper: "Self-Supervised Learning for Endoscopy Video Analysis". ☆24 · Oct 2, 2023 · Updated 2 years ago
- TabMap for high-performance tabular data analysis - Nature BME ☆20 · Jan 8, 2025 · Updated last year
- ☆22 · Sep 19, 2025 · Updated 9 months ago
- OpenThinkIMG is an end-to-end open-source framework that empowers Large Vision-Language Models to think with images. ☆123 · Jul 11, 2025 · Updated 11 months ago
- Code for "Multi-Time Attention Networks for Irregularly Sampled Time Series", ICLR 2021. ☆144 · Jun 8, 2021 · Updated 5 years ago
- Failures in machine learning for medical imaging ☆32 · Feb 15, 2022 · Updated 4 years ago
- Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs" ☆95 · Mar 9, 2025 · Updated last year
- Boltzmann Attention Sampling for Image Analysis with Small Objects ☆36 · Apr 21, 2026 · Updated 2 months ago
- Time series vectorization methods that could give better representations for classification, clustering, or other analysis. ☆11 · Jan 4, 2016 · Updated 10 years ago
- ViLMedic (Vision-and-Language medical research) is a modular framework for vision and language multimodal research in the medical field ☆189 · Oct 9, 2025 · Updated 8 months ago
- PyTorch code corresponding to my blog series on adversarial examples and (confidence-calibrated) adversarial training. ☆67 · Apr 26, 2023 · Updated 3 years ago
- End-to-end voicebot that answers open-domain questions. ☆10 · Oct 23, 2021 · Updated 4 years ago
- P2P Encrypted File Sharing In Your Browser via WebRTC ☆37 · Dec 31, 2017 · Updated 8 years ago
- Node.js-based CMS, built on top of the grapejs framework ☆12 · Sep 8, 2016 · Updated 9 years ago
- A toy DuckDB-based time-series database ☆15 · Sep 30, 2020 · Updated 5 years ago
- AAAI2023 Reducing Domain Gap in Frequency and Spatial domain for Cross-modality Domain Adaptation on Medical Image Segmentation ☆29 · Sep 19, 2023 · Updated 2 years ago
- The more often you click a word in the headlines, the more interesting your news becomes. ☆13 · Mar 27, 2017 · Updated 9 years ago
- Docker for running stroke lesion core segmentation ☆31 · Dec 15, 2020 · Updated 5 years ago