facebookresearch / fairseq2
FAIR Sequence Modeling Toolkit 2
☆1,044 Updated this week
Alternatives and similar repositories for fairseq2
Users who are interested in fairseq2 compare it to the libraries listed below.
- [ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling ☆1,219 Updated 7 months ago
- NeMo text processing for ASR and TTS ☆380 Updated last week
- Inworld TTS ☆516 Updated last month
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te… ☆283 Updated 2 weeks ago
- SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders. ☆828 Updated 3 weeks ago
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a… ☆615 Updated last year
- AcademiCodec: An Open Source Audio Codec Model for Academic Research ☆655 Updated last year
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate ☆691 Updated 11 months ago
- Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads ☆2,650 Updated last year
- Code for the ALiBi method for transformer language models (ICLR 2022) ☆544 Updated 2 years ago
- The official implementation of Self-Play Preference Optimization (SPPO) ☆581 Updated 9 months ago
- Library for Textless Spoken Language Processing ☆552 Updated 2 years ago
- [NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models ☆272 Updated 7 months ago
- PyTorch Implementation of ProDiff (ACM-MM'22) with an Extremely-Fast diffusion speech synthesis pipeline ☆435 Updated 2 years ago
- Audio Large Language Models ☆770 Updated 3 months ago
- Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities. ☆1,834 Updated 9 months ago
- A Neural Framework for MT Evaluation ☆678 Updated 2 months ago
- Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration ☆1,585 Updated 10 months ago
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis ☆1,004 Updated last year
- The first Large Audio Language Model that enables native in-depth thinking, which is trained on large-scale audio Chain-of-Thought data. ☆265 Updated 5 months ago
- CoVoST: A Large-Scale Multilingual Speech-To-Text Translation Corpus (CC0 Licensed) ☆389 Updated 4 years ago
- Uni-MoE: Lychee's Large Multimodal Model Family. ☆793 Updated last week
- Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data. ☆1,006 Updated last year
- PyTorch Implementation of FastDiff (IJCAI'22) ☆411 Updated last year
- Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch ☆654 Updated 10 months ago
- SimulEval: A General Evaluation Toolkit for Simultaneous Translation ☆117 Updated last year
- [ICCV 2025] Official Implementation for "Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition" ☆301 Updated 9 months ago
- Large Context Attention ☆746 Updated 2 weeks ago
- Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons ☆1,184 Updated 2 months ago
- A PyTorch quantization backend for optimum ☆999 Updated last week