arxyzan / data2vec-pytorch
PyTorch implementation of "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language" from Meta AI
☆176 Updated last year
Alternatives and similar repositories for data2vec-pytorch:
Users who are interested in data2vec-pytorch are comparing it to the libraries listed below.
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022 ☆111 Updated 2 years ago
- Implementation of Linformer for Pytorch ☆266 Updated last year
- This repo hosts the code and models of "Masked Autoencoders that Listen". ☆566 Updated 10 months ago
- Implementation of the convolutional module from the Conformer paper, for use in Transformers ☆383 Updated last year
- [NeurIPS'22] Squeezeformer: An Efficient Transformer for Automatic Speech Recognition ☆249 Updated 2 years ago
- Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021). ☆226 Updated 2 years ago
- Implementation of Fast Transformer in Pytorch ☆172 Updated 3 years ago
- Unofficial PyTorch Implementation for pNLP-Mixer: an Efficient all-MLP Architecture for Language (https://arxiv.org/abs/2202.04350) ☆63 Updated 2 years ago
- 🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps ☆151 Updated 9 months ago
- [ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition ☆213 Updated last year
- Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning ☆159 Updated last year
- BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation ☆209 Updated last year
- An implementation of local windowed attention for language modeling ☆412 Updated last month
- PyTorch implementation of "Squeezeformer: An Efficient Transformer for Automatic Speech Recognition" (NeurIPS 2022) ☆133 Updated 2 years ago
- [DEPRECATED] A knowledge distillation toolkit based on PyTorch and PyTorch Lightning. ☆139 Updated 11 months ago
- An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion. ☆357 Updated 3 years ago
- Sequence modeling with Mega. ☆298 Updated 2 years ago
- Implementation of Zorro, Masked Multimodal Transformer, in Pytorch ☆95 Updated last year
- Code for the ALiBi method for transformer language models (ICLR 2022) ☆515 Updated last year
- A simple cross attention that updates both the source and target in one step ☆162 Updated 9 months ago
- Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder". ☆244 Updated 10 months ago
- Pytorch implementation of stochastically quantized variational autoencoder (SQ-VAE) ☆185 Updated 2 years ago
- This repo hosts the code and model of MAViL. ☆42 Updated last year
- An Audio Language model for Audio Tasks ☆301 Updated 9 months ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆78 Updated 2 years ago
- Official code for Wav2Seq ☆96 Updated 2 years ago
- Implementation of Memformer, a Memory-augmented Transformer, in Pytorch ☆110 Updated 4 years ago
- Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms ☆257 Updated 3 years ago
- ☆163 Updated 2 years ago
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆204 Updated last year