Audio-Visual Generalized Zero-Shot Learning using Large Pre-Trained Models
☆23Apr 15, 2024Updated 2 years ago
Alternatives and similar repositories for ClipClap-GZSL
Users that are interested in ClipClap-GZSL are comparing it to the libraries listed below.
- This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and …☆42Nov 29, 2022Updated 3 years ago
- ☆11Apr 12, 2024Updated 2 years ago
- Rainbow Keywords - Official PyTorch Implementation☆14Jun 27, 2024Updated last year
- Demo page of TAVGBench: Benchmarking Text to Audible-Video Generation☆15Apr 7, 2025Updated last year
- This repository is for The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion (ICCV 2023)☆25Dec 7, 2023Updated 2 years ago
- An unofficial (PyTorch) implementation for the paper Deep Lip Reading: A comparison of models and an online application.☆10May 13, 2020Updated 5 years ago
- Source code for reproducing the results reported in our paper: https://arxiv.org/abs/2409.17550☆14Nov 15, 2024Updated last year
- [AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer☆74Mar 6, 2025Updated last year
- ☆23Mar 20, 2024Updated 2 years ago
- Official Code Repository for the paper "Generating Realistic Images from In-the-wild Sounds", ICCV 2023☆12Aug 24, 2025Updated 8 months ago
- Source codes for the paper "Personalized Dynamic Music Emotion Recognition with Dual-Scale Attention-Based Meta-Learning" (PDMER) which p…☆13Mar 24, 2025Updated last year
- [CVPR 2023] Official implementation of our paper - Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learnin…☆27Apr 10, 2023Updated 3 years ago
- Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"☆38Oct 11, 2024Updated last year
- ☆14Nov 13, 2023Updated 2 years ago
- Official code for the paper: [ICCV2023] Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation☆42Dec 23, 2023Updated 2 years ago
- Drive-end fault data sampled at 12 kHz, from the CWRU (Case Western Reserve University) fault diagnosis experimental dataset☆10May 31, 2023Updated 2 years ago
- The MAVD represents Mandarin Audio-Visual dataset with Depth information. MAVD has a rich variety of modal data, including audio, RGB ima…☆20Apr 22, 2024Updated 2 years ago
- Details of the datasets for Few-shot class-incremental audio classification☆10Dec 6, 2023Updated 2 years ago
- An implementation of http://openaccess.thecvf.com/content_CVPRW_2019/papers/Sight%20and%20Sound/Konstantinos_Vougioukas_End-to-End_Speech…☆18Mar 19, 2020Updated 6 years ago
- ☆40Apr 14, 2025Updated last year
- Code for CLVision workshop (CVPR 2024) paper - Calibrating Higher-Order Statistics for Few-Shot Class-Incremental Learning with Pre-train…☆11Nov 12, 2024Updated last year
- ☆26Jul 15, 2024Updated last year
- Text Clustering as Classification with LLMs☆18Oct 2, 2025Updated 7 months ago
- This is my speaker recognition implementation based on the x-vector system described in "X-Vectors: Robust DNN Embeddings for Speaker Rec…☆10Nov 3, 2022Updated 3 years ago
- Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language (AAAI 2025)☆24Mar 17, 2025Updated last year
- Baseline system for CNVSRC2023 (Chinese Continuous Visual Speech Recognition Challenge 2023)☆23Apr 27, 2024Updated 2 years ago
- Official Implementation of "NeRI: Implicit Neural Representation Of LiDAR Point Cloud Using Range Image Sequence"☆14Mar 7, 2024Updated 2 years ago
- [ICCV2023] CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection☆19Apr 23, 2025Updated last year
- Simple i2c example for STM32F4: scan the i2c bus for connected devices☆11Jul 15, 2015Updated 10 years ago
- Chameleon: A Multiplier-Free Temporal Convolutional Network Accelerator for End-to-End Few-Shot and Continual Learning from Sequential Da…☆27Mar 5, 2026Updated 2 months ago
- Spiking CNN for object recognition☆12Apr 26, 2017Updated 9 years ago
- ☆27Jun 27, 2023Updated 2 years ago
- IMU and GPS fusion using an ESKF (error-state Kalman filter)☆11Jan 12, 2023Updated 3 years ago
- An implementation of ORB-SLAM3 in Python.☆10Jul 2, 2023Updated 2 years ago
- This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptati…☆128Feb 13, 2025Updated last year
- Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".☆93Dec 8, 2023Updated 2 years ago
- Official PyTorch implementation of SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy M…☆38Aug 27, 2024Updated last year
- [ICASSP'22] Continual Learning Benchmark for Spoken Keyword Spotting☆17Jun 7, 2022Updated 3 years ago
- ☆17Mar 23, 2020Updated 6 years ago