arxyzan/data2vec-pytorch

PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language" from Meta AI

☆187

Alternatives and similar repositories for data2vec-pytorch

Users interested in data2vec-pytorch are comparing it to the libraries listed below.

Guillem96 / data2vec-vision
PyTorch implementation of Data2Vec self-supervised approach for vision use cases.
☆18 · Oct 7, 2022 · Updated 3 years ago
facebookresearch / data2vec_vision
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆81 · Jan 7, 2026 · Updated 6 months ago
arxyzan / vanilla-transformer
A clean PyTorch implementation of the original Transformer model + A German -> English translation example
☆38 · Jan 24, 2022 · Updated 4 years ago
arxyzan / flask-docker-opencv-nginx
A web app built with Flask to stream video feed from any IP camera to the users of a local network
☆14 · Dec 23, 2021 · Updated 4 years ago
dataak / fastapi-blueprint
FastAPI CLI is a command-line tool designed to help developers quickly generate a structured project file system for FastAPI applications…
☆12 · Feb 3, 2025 · Updated last year
glory20h / FitHuBERT
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning (INTERSPEECH 2022)
☆19 · Nov 15, 2023 · Updated 2 years ago
ECNU-Cross-Innovation-Lab / ShiftSER
[ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations
☆39 · Dec 18, 2023 · Updated 2 years ago
igormq / speech2text
☆12 · Feb 9, 2021 · Updated 5 years ago
Hamtech-ai / Persian-Image-Captioning
A Persian Image Captioning model based on Vision Encoder Decoder Models of the transformers🤗.
☆20 · Feb 27, 2022 · Updated 4 years ago
arxyzan / fraud-detection-gnn
Fraud Detection using various GNN models
☆19 · Jul 2, 2023 · Updated 3 years ago
HolgerBovbjerg / data2vec-KWS
This repository contains code for applying Data2Vec to pretrain Keyword Transformer model as described in "Improving Label-Deficient Keyw…
☆32 · Mar 6, 2025 · Updated last year
ducanhdt / openai_whisper_finetuning
☆49 · Apr 28, 2023 · Updated 3 years ago
lijuncheng16 / AudioTaggingDoneRight
Experiments on AudioSet
☆43 · Jul 22, 2023 · Updated 3 years ago
JindongGu / SimDis
A PyTorch implementation of the ICCV 2021 workshop paper SimDis: Simple Distillation Baselines for Improving Small Self-supervised Models
☆14 · Jul 15, 2021 · Updated 5 years ago
sungnyun / ARMHuBERT
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
☆41 · Aug 29, 2024 · Updated last year
JinhuaLiang / lam4fsl
An official repo for the paper "Adapting Language-Audio Models as Few-Shot Audio Learners"
☆31 · May 31, 2023 · Updated 3 years ago
mechanicalsea / lighthubert
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT
☆73 · Sep 26, 2022 · Updated 3 years ago
gaasher / data2vec2.0_vision
Implementation of Data2Vec 2.0 for vision
☆18 · Feb 6, 2023 · Updated 3 years ago
vectominist / MiniASR
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
☆53 · Dec 6, 2022 · Updated 3 years ago
Taeu / HeLP-Challenge-Goldenpass
☆11 · Mar 12, 2019 · Updated 7 years ago
nervjack2 / MelHuBERT
Official implementation of MelHuBERT
☆70 · Feb 21, 2026 · Updated 5 months ago
ahaliassos / usr2
PyTorch implementation of USR 2.0 (ICLR 2026)
☆15 · Apr 3, 2026 · Updated 3 months ago
sungnyun / avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
☆13 · Oct 22, 2024 · Updated last year
tbenst / silent_speech
Official repository for "A Cross-Modal Approach to Silent Speech with LLM-Enhanced Recognition"
☆18 · Mar 14, 2024 · Updated 2 years ago
KiLJ4EdeN / Abnormal_Heart_Sound_Diagnosis
Convolutional Long Short-Term Memory (CNN-LSTM) evaluation on the heart sound database with 91% accuracy.
☆17 · Jan 21, 2020 · Updated 6 years ago
Sreyan88 / LAPE
A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning)
☆29 · Jul 9, 2024 · Updated 2 years ago
nttcslab / dcase2023_task2_evaluator
☆12 · Aug 10, 2023 · Updated 2 years ago
b04901014 / FT-w2v2-ser
Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
☆153 · Oct 26, 2021 · Updated 4 years ago
Sreyan88 / LipGER
Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆19 · Jul 16, 2024 · Updated 2 years ago
nttcslab / composing-general-audio-repr
Composing General Audio Representation by Fusing Multilayer Features of a Pre-trained Model
☆26 · Apr 26, 2023 · Updated 3 years ago
KiLJ4EdeN / Realtime_FacialRecognition
Real-time facial recognition using the face_recognition module and IP cameras.
☆22 · Sep 15, 2020 · Updated 5 years ago
ciaua / score_lyrics_free_svg
Score- and Lyrics-Free Singing Voice Generation
☆28 · May 25, 2020 · Updated 6 years ago
ddlBoJack / MT4SSL
[INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In…
☆45 · Mar 25, 2024 · Updated 2 years ago
MGitHubL / TMac
☆14 · Feb 26, 2024 · Updated 2 years ago
asappresearch / wav2seq
Official code for Wav2Seq
☆97 · Jul 19, 2022 · Updated 4 years ago
david-gimeno / tailored-avsr
Official source code for the paper "Tailored Design of Audio-Visual Speech Recognition Models using Branchformers"
☆15 · Feb 24, 2025 · Updated last year
karthikbhamidipati / multi-task-speech-classification
Multi-task speech classification of the accent and gender of an English speaker on Mozilla's Common Voice dataset
☆28 · Jul 17, 2026 · Updated last week
jindongwang / EasyEspnet
Making ESPnet easier to use
☆54 · Apr 9, 2021 · Updated 5 years ago
Helw150 / levanter
Legible, Scalable, Reproducible Foundation Models with Named Tensors and JAX
☆16 · Jun 16, 2024 · Updated 2 years ago