roudimit/c2kd

Code for the C2KD paper (ICASSP 2023)

☆20

Alternatives and similar repositories for c2kd

Users who are interested in c2kd are comparing it to the repositories listed below.

rxtan2 / video-grounding-narrations
☆12 · Mar 12, 2023 · Updated 3 years ago
jasonppy / FaST-VGS-Family
Transformer-based visually grounded speech models
☆19 · Sep 22, 2022 · Updated 3 years ago
haoheliu / ontology-aware-audio-tagging
☆14 · Nov 22, 2022 · Updated 3 years ago
jasonppy / word-discovery
Word Discovery in Visually Grounded, Self-Supervised Speech Models
☆27 · Dec 4, 2023 · Updated 2 years ago
wnhsu / ResDAVEnet-VQ
Official codes for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"
☆28 · Feb 22, 2022 · Updated 4 years ago
oncescuandreea / QuerYD_downloader
☆23 · Dec 5, 2023 · Updated 2 years ago
VisualAIKHU / Keyword-DETR
Official Repository for "Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection" (AAAI …
☆15 · Mar 1, 2025 · Updated last year
phosseini / GisPy
GisPy: A Tool for Measuring Gist Inference Score in Text https://aclanthology.org/2022.wnu-1.5/
☆13 · Jul 1, 2024 · Updated 2 years ago
jasonppy / syllable-discovery
Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
☆35 · Aug 27, 2023 · Updated 2 years ago
sonalkum / MMAUPro
Official repo for MMAU-Pro Benchmark
☆22 · Sep 25, 2025 · Updated 9 months ago
ioanacroi / longmoment-detr
Moment Detection in Long Tutorial Videos
☆20 · May 8, 2024 · Updated 2 years ago
kamperh / vqwordseg
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.
☆39 · May 5, 2026 · Updated 2 months ago
AlanBaade / MAE-AST-Public
Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
☆93 · Jun 9, 2022 · Updated 4 years ago
edsonroteia / cav-mae-sync
[CVPR25] Official Implementation of CAV-MAE Sync
☆31 · Apr 5, 2026 · Updated 3 months ago
Cloud-CV / vilbert-multi-task
12-in-1: Multi-Task Vision and Language Representation Learning Web Demo
☆35 · Dec 8, 2022 · Updated 3 years ago
YuanGongND / uavm
Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".
☆57 · Apr 20, 2023 · Updated 3 years ago
hearbenchmark / hear2021-submitted-models
Open-source audio embedding models, submitted to the HEAR 2021 challenge
☆11 · Feb 15, 2026 · Updated 5 months ago
Ego4DSounds / Ego4DSounds
Ego4DSounds: A diverse egocentric dataset with high action-audio correspondence
☆21 · Jun 14, 2024 · Updated 2 years ago
MGitHubL / TMac
☆14 · Feb 26, 2024 · Updated 2 years ago
TAMS-Group / tams_glass_reconstruction
Detection and Reconstruction of Transparent Objects with Infrared Projection-based RGB-D Cameras
☆13 · Jan 17, 2021 · Updated 5 years ago
ChanganVR / action2sound
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
☆26 · Oct 1, 2024 · Updated last year
llyx97 / TAMT
[NAACL 2022] "Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training", Yuanxin Liu, Fandong Meng, Zheng Lin, Pe…
☆15 · Oct 18, 2022 · Updated 3 years ago
hvy / chainer-faster-rcnn
☆10 · Apr 22, 2016 · Updated 10 years ago
FreedomIntelligence / DPTDR
Code for COLING22 paper, DPTDR: Deep Prompt Tuning for Dense Passage Retrieval
☆26 · Aug 7, 2023 · Updated 2 years ago
roudimit / Omni-R1
[ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?
☆47 · Nov 21, 2025 · Updated 8 months ago
gjhhust / XS-VID
XS-VID: An Extra Small Object Video Detection Dataset
☆10 · Mar 4, 2025 · Updated last year
sarulab-speech / ml-audiocaps
Multi-lingual AudioCaps
☆14 · Nov 20, 2023 · Updated 2 years ago
Kowalski1024 / Mi-Go
Mi-Go is an open-source test framework designed to evaluate and compare the accuracy of speech-to-text models on YouTube dataset.
☆12 · Jul 2, 2024 · Updated 2 years ago
Coolgenome / GastricCancer
☆12 · Mar 5, 2024 · Updated 2 years ago
nikvaessen / disjoint-mtl
Research code for "Towards multi-task learning of speech and speaker recognition" at https://arxiv.org/pdf/2302.12773.pdf
☆12 · Dec 2, 2024 · Updated last year
nttcslab / dcase2023_task2_evaluator
☆12 · Aug 10, 2023 · Updated 2 years ago
KinWaiCheuk / IJCNN2020_music_transcription
Source code for the paper published in IJCNN 2020 "The Impact of Audio Input Representations on Neural Network based Music Transcription"
☆13 · Apr 9, 2020 · Updated 6 years ago
qinenergy / webvision-2020-public
Webvision Challenge 2020 developer kit
☆10 · Dec 8, 2022 · Updated 3 years ago
BoyuanChen / boombox
Code release for paper: The Boombox: Visual Reconstruction from Acoustic Vibrations
☆15 · May 18, 2021 · Updated 5 years ago
Hertin / WavPrompt
☆37 · Jun 30, 2022 · Updated 4 years ago
DTaoo / DMC
Code for Deep Multimodal Clustering for Unsupervised Audiovisual Learning (CVPR2019)
☆15 · May 27, 2020 · Updated 6 years ago
fundamentalvision / Siamese-Image-Modeling
☆16 · Jul 7, 2023 · Updated 3 years ago
IBM / comical
Contrastive multi-omics association learning
☆13 · Apr 28, 2026 · Updated 2 months ago
archival-archetyping / i.frame
i.frame is an open-source platform for decentralized online events. You can provide cohesion and coherence temporarily to the programs di…
☆12 · Dec 26, 2021 · Updated 4 years ago