facebookresearch / fairseq2
FAIR Sequence Modeling Toolkit 2
☆1,101 · Updated this week
Alternatives and similar repositories for fairseq2
Users who are interested in fairseq2 are comparing it to the libraries listed below.
- [ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling ☆1,246 · Updated 9 months ago
- SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders. ☆851 · Updated 2 months ago
- Inworld TTS ☆583 · Updated 3 months ago
- Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads ☆2,680 · Updated last year
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te… ☆287 · Updated 2 months ago
- NeMo text processing for ASR and TTS ☆400 · Updated last week
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a… ☆637 · Updated last year
- The official implementation of Self-Play Preference Optimization (SPPO) ☆583 · Updated 11 months ago
- Library for Textless Spoken Language Processing ☆554 · Updated 2 years ago
- AcademiCodec: An Open Source Audio Codec Model for Academic Research ☆662 · Updated last year
- Audio Large Language Models ☆826 · Updated 5 months ago
- Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration ☆1,593 · Updated 11 months ago
- One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks ☆3,437 · Updated last week
- SimulEval: A General Evaluation Toolkit for Simultaneous Translation ☆119 · Updated last year
- MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation ☆401 · Updated 2 years ago
- Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand". ☆464 · Updated last year
- Uni-MoE: Lychee's Large Multimodal Model Family. ☆1,049 · Updated last week
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate ☆733 · Updated last year
- [NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models ☆278 · Updated 9 months ago
- A Framework for Speech, Language, Audio, Music Processing with Large Language Model ☆939 · Updated 2 months ago
- Fast & Simple repository for pre-training and fine-tuning T5-style models ☆1,017 · Updated last year
- Facebook Low Resource (FLoRes) MT Benchmark ☆757 · Updated 2 years ago
- Code for the ALiBi method for transformer language models (ICLR 2022) ☆547 · Updated 2 years ago
- A large-scale multilingual speech corpus for representation learning, semi-supervised learning and interpretation ☆564 · Updated 2 years ago
- Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities. ☆1,847 · Updated 11 months ago
- ☆966 · Updated this week
- PyTorch Implementation of ProDiff (ACM-MM'22) with an Extremely-Fast diffusion speech synthesis pipeline ☆432 · Updated 2 years ago
- Cramming the training of a (BERT-type) language model into limited compute. ☆1,355 · Updated last year
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ☆1,310 · Updated 9 months ago
- SALMONN family: A suite of advanced multi-modal LLMs ☆1,373 · Updated 2 months ago