ExplainableML/TCAF-GZSL

This repository contains the code for our ECCV 2022 paper "Temporal and cross-modal attention for audio-visual zero-shot learning"

☆25

Alternatives and similar repositories for TCAF-GZSL

Users who are interested in TCAF-GZSL compare it to the repositories listed below.

ExplainableML / AVCA-GZSL
View on GitHub
This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and …
☆42 · Nov 29, 2022 · Updated 3 years ago
ExplainableML / ZerAuCap
View on GitHub
[NeurIPS 2023 - ML for Audio Workshop (Oral)] Zero-shot audio captioning with audio-language model guidance and audio context keywords
☆19 · Nov 30, 2024 · Updated last year
akoepke / audio-retrieval-benchmark
View on GitHub
Code for "Audio Retrieval with Natural Language Queries: A Benchmark Study", Transactions on Multimedia 2022
☆54 · Jul 16, 2025 · Updated last year
ExplainableML / CLEVR-X
View on GitHub
CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations
☆30 · Oct 27, 2023 · Updated 2 years ago
GeWu-Lab / MMCosine_ICASSP23
View on GitHub
The code repo for ICASSP 2023 Paper "MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning"
☆26 · May 18, 2023 · Updated 3 years ago
visipedia / ssw60
View on GitHub
Sapsucker Woods 60 Audiovisual Dataset
☆19 · Oct 7, 2022 · Updated 3 years ago
hbdat / iccv21_relational_direction
View on GitHub
Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations @ ICCV21
☆13 · Jul 15, 2022 · Updated 4 years ago
kyuyeonpooh / objects-that-sound
View on GitHub
An unofficial implementation of the paper "Objects that Sound" from ECCV 2018.
☆31 · Jan 29, 2024 · Updated 2 years ago
ExplainableML / ImageFreeZSL
View on GitHub
☆18 · Oct 5, 2024 · Updated last year
Bizilizi / VGGSounder
View on GitHub
VGGSounder, a multi-label audio-visual classification dataset with modality annotations.
☆17 · Jun 30, 2026 · Updated 3 weeks ago
oncescuandreea / audio-retrieval
View on GitHub
Implementation of "Audio Retrieval with Natural Language Queries", INTERSPEECH 2021, PyTorch
☆26 · Aug 18, 2023 · Updated 2 years ago
zhaoyanpeng / audioset-dl
View on GitHub
Download AudioSet for Vision-Audio-Text Pre-training
☆13 · May 16, 2022 · Updated 4 years ago
GeWu-Lab / CSOL_TPAMI2021
View on GitHub
The repo for "Class-aware Sounding Objects Localization", TPAMI 2021.
☆29 · Mar 4, 2022 · Updated 4 years ago
JunlinHan / MachineMem
View on GitHub
Code of "What Images are More Memorable to Machines?"
☆15 · Feb 13, 2023 · Updated 3 years ago
RenzKa / VIA_sign-language-annotation
View on GitHub
VIA modification for sign language annotation
☆18 · Apr 30, 2021 · Updated 5 years ago
andresperezEUT / ambiscaper
View on GitHub
Ambiscaper: a tool for automatic dataset generation and annotation of reverberant Ambisonics audio. Originally forked from http://github.…
☆22 · Sep 14, 2018 · Updated 7 years ago
seongkyun / pytorch-classifications
View on GitHub
PyTorch classification with CIFAR-10, CIFAR-100, and STL-10
☆14 · Jul 24, 2019 · Updated 6 years ago
hendriks73 / directional_cnns
View on GitHub
Source code repository for the SMC paper "Musical Tempo and Key Estimation using Convolutional Neural Networks with Directional Filters".
☆33 · Mar 24, 2023 · Updated 3 years ago
agsarthak / Goal-oriented-Dialogue-Systems
View on GitHub
Applying deep reinforcement learning to dialogue generation (a chatbot)
☆13 · Apr 30, 2017 · Updated 9 years ago
NoManNayeem / Langchain_CrewAI_Gemini-AI_Agents
View on GitHub
Langchain_CrewAI_Gemini - A Gemini-powered multi-agent AI project.
☆14 · Mar 24, 2024 · Updated 2 years ago
csiro-icvg / Diff3DHPE
View on GitHub
Diff3DHPE: A Diffusion Model for 3D Human Pose Estimation [R6D 2023] [Official]
☆15 · May 23, 2024 · Updated 2 years ago
czifan / RAML
View on GitHub
☆15 · Dec 13, 2022 · Updated 3 years ago
ZurichRain / HMCGR
View on GitHub
Code for the COLING paper "A Hybrid Model of Classification and Generation for Spatial Relation Extraction"
☆10 · Oct 20, 2022 · Updated 3 years ago
atultiwari / LLaVA-Med
View on GitHub
Large Language-and-Vision Assistant for BioMedicine, built towards multimodal GPT-4 level capabilities.
☆10 · Nov 29, 2023 · Updated 2 years ago
Out-of-Distribution-Generalization / Out-of-Distribution-Generalization.github.io
View on GitHub
☆27 · Feb 2, 2023 · Updated 3 years ago
eliasgoldsztejn95 / PTDRL
View on GitHub
Hospital simulator with pedestrians and robot
☆15 · Oct 20, 2024 · Updated last year
sangho-vision / acav100m
View on GitHub
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning. In ICCV, 2021.
☆64 · Nov 18, 2021 · Updated 4 years ago
PengWan-Yang / commonLocalization
View on GitHub
☆17 · Nov 5, 2020 · Updated 5 years ago
gulvarol / bsldict
View on GitHub
Watch, read and lookup: learning to spot signs from multiple supervisors, ACCV 2020 (Best Application Paper)
☆34 · Apr 10, 2023 · Updated 3 years ago
kaishxu / DFMed
View on GitHub
Code and data for "Medical Dialogue Generation via Dual Flow Modeling" (ACL 2023 Findings)
☆14 · Nov 22, 2023 · Updated 2 years ago
uqzhichen / Awesome-compositional-zero-shot-learning
View on GitHub
Paper list of compositional zero-shot learning
☆11 · Jul 5, 2022 · Updated 4 years ago
tan90xx / distillw2n
View on GitHub
🤫A Lightweight One-Shot Whisper to Normal Voice Conversion Model Using Distillation of Self-Supervised Features
☆26 · Dec 10, 2025 · Updated 7 months ago
stefanhgm / patient_summaries_with_llms
View on GitHub
Code for "A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models"
☆17 · Jul 20, 2025 · Updated last year
zhongpeixiang / affect-rich-conversational-model
View on GitHub
The PyTorch code for paper: An Affect-Rich Neural Conversational Model with Biased Attention and Weighted Cross-Entropy Loss
☆12 · Oct 7, 2019 · Updated 6 years ago
donghoney0416 / DeFTAN-II
View on GitHub
Official page of "DeFTAN-II: Efficient multichannel speech enhancement with subgroup processing", IEEE/ACM Transactions on Audio, Speech,…
☆34 · Nov 21, 2024 · Updated last year
eujhwang / personalized-llms
View on GitHub
personalized-llms with allen institute
☆13 · Jun 22, 2023 · Updated 3 years ago
ZuoJiaxing / monother_depth
View on GitHub
Code released for paper titled "MonoTher-Depth: Enhancing Thermal Depth Estimation via Confidence-Aware Distillation"
☆19 · Sep 22, 2025 · Updated 9 months ago
roudimit / AVLnet
View on GitHub
Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.
☆54 · Mar 30, 2022 · Updated 4 years ago
caravanuden / cardio
View on GitHub
Cardiovascular disease dataset analysis for Data Science for Health (COSC 89.20)
☆24 · May 10, 2019 · Updated 7 years ago