arxyzan / data2vec-pytorch
PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language" from Meta AI
☆177 · Updated last year
Alternatives and similar repositories for data2vec-pytorch
Users who are interested in data2vec-pytorch compare it to the libraries listed below.
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022 ☆115 · Updated 2 years ago
- Implementation of Zorro, Masked Multimodal Transformer, in Pytorch ☆97 · Updated last year
- Sequence modeling with Mega. ☆295 · Updated 2 years ago
- Official code for Wav2Seq ☆96 · Updated 2 years ago
- Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021). ☆225 · Updated 3 years ago
- A simple cross attention that updates both the source and target in one step ☆172 · Updated last year
- A curated list of awesome adversarial reprogramming and input prompting methods for neural networks since 2022 ☆36 · Updated last year
- [NeurIPS'22] Squeezeformer: An Efficient Transformer for Automatic Speech Recognition ☆251 · Updated 2 years ago
- An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion. ☆359 · Updated 3 years ago
- This repo hosts the code and models of "Masked Autoencoders that Listen". ☆585 · Updated last year
- This repo hosts the code and models of MAViL. ☆42 · Updated last year
- Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations ☆90 · Updated 10 months ago
- ☆164 · Updated 2 years ago
- Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder". ☆256 · Updated last year
- Unofficial PyTorch Implementation for pNLP-Mixer: an Efficient all-MLP Architecture for Language (https://arxiv.org/abs/2202.04350) ☆63 · Updated 3 years ago
- Audio Captioning datasets for PyTorch. ☆117 · Updated last month
- BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation ☆212 · Updated 2 years ago
- Implementation of Linformer for Pytorch ☆286 · Updated last year
- Implementation of data2vec 2.0 for vision ☆18 · Updated 2 years ago
- Pytorch implementation of stochastically quantized variational autoencoder (SQ-VAE) ☆187 · Updated 2 years ago
- Python code for handling the Clotho dataset. ☆82 · Updated 4 years ago
- An Audio Language model for Audio Tasks ☆304 · Updated last year
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT ☆73 · Updated 2 years ago
- SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition ☆78 · Updated 4 years ago
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆203 · Updated last year
- A minimal pytorch package implementing a gradient reversal layer. ☆158 · Updated 6 months ago
- PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022) ☆141 · Updated 2 years ago
- Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models". ☆55 · Updated 2 years ago
- Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer". ☆381 · Updated 2 years ago
- BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis ☆228 · Updated 2 years ago