facebookresearch / fairseq2
FAIR Sequence Modeling Toolkit 2
☆1,025Updated this week
Alternatives and similar repositories for fairseq2
Users who are interested in fairseq2 are comparing it to the libraries listed below.
- [ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling☆1,178Updated 5 months ago
- Inworld TTS☆442Updated last week
- SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.☆797Updated 2 weeks ago
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te…☆282Updated 6 months ago
- NeMo text processing for ASR and TTS☆351Updated last week
- Code for the ALiBi method for transformer language models (ICLR 2022)☆539Updated last year
- AcademiCodec: An Open Source Audio Codec Model for Academic Research☆636Updated last year
- Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads☆2,593Updated last year
- This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples a…☆590Updated last year
- The official implementation of Self-Play Preference Optimization (SPPO)☆575Updated 6 months ago
- Code for "Uni-MoE: Scaling Unified Multimodal Models with Mixture of Experts"☆745Updated this week
- Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities.☆1,788Updated 6 months ago
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate☆639Updated 8 months ago
- [NeurIPS 2024] BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models☆265Updated 4 months ago
- Library for Textless Spoken Language Processing☆549Updated last year
- Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch☆647Updated 7 months ago
- State-of-the-art LLM-based translation models.☆548Updated 4 months ago
- One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks☆2,953Updated this week
- Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration☆1,579Updated 7 months ago
- PyTorch Implementation of ProDiff (ACM-MM'22) with an Extremely-Fast diffusion speech synthesis pipeline☆433Updated 2 years ago
- ☆1,444Updated last year
- A Neural Framework for MT Evaluation☆642Updated this week
- ☆668Updated this week
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch☆659Updated 10 months ago
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding☆1,267Updated 5 months ago
- Facebook Low Resource (FLoRes) MT Benchmark☆747Updated last year
- PyTorch Implementation of FastDiff (IJCAI'22)☆411Updated last year
- Awesome speech/audio LLMs, representation learning, and codec models☆1,093Updated 3 weeks ago
- Speech, Language, Audio, Music Processing with Large Language Model☆868Updated this week
- Scalable toolkit for efficient model alignment☆834Updated 2 weeks ago