FairSeq repo with Apollo optimizer
☆113Dec 20, 2023Updated 2 years ago
Alternatives and similar repositories for fairseq-apollo
Users that are interested in fairseq-apollo are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- A PyTorch Implementation of the Luna: Linear Unified Nested Attention☆41Jul 29, 2021Updated 4 years ago
- Sequence modeling with Mega.☆303Jan 28, 2023Updated 3 years ago
- Deep neural models for core NLP tasks☆13Nov 9, 2017Updated 8 years ago
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …☆11Mar 18, 2023Updated 3 years ago
- Efficient PScan implementation in PyTorch☆17Jan 2, 2024Updated 2 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena☆207Aug 26, 2023Updated 2 years ago
- Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization☆182Nov 21, 2021Updated 4 years ago
- The accompanying code for "Simplifying and Understanding State Space Models with Diagonal Linear RNNs" (Ankit Gupta, Harsh Mehta, Jonatha…☆23Dec 30, 2022Updated 3 years ago
- The official repository for Efficient Long-Text Understanding Using Short-Text Models (Ivgi et al., 2022) paper☆70May 14, 2023Updated 2 years ago
- ☆14Nov 20, 2022Updated 3 years ago
- ☆16Oct 16, 2024Updated last year
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights☆19Oct 9, 2022Updated 3 years ago
- Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021).☆228Apr 18, 2022Updated 3 years ago
- MeCab model trained with OpenKorPos.☆23Jun 19, 2022Updated 3 years ago
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- ☆22Dec 1, 2021Updated 4 years ago
- A utility for storing and reading files for Korean LM training 💾☆35Oct 15, 2025Updated 5 months ago
- PyTorch Implementation of NeurIPS 2020 paper "Learning Sparse Prototypes for Text Generation"☆22Jul 8, 2021Updated 4 years ago
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆59Jan 12, 2023Updated 3 years ago
- Variable-order CRFs with structure learning☆17Aug 1, 2024Updated last year
- ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation☆25Oct 2, 2020Updated 5 years ago
- ☆20Sep 17, 2021Updated 4 years ago
- ☆44Sep 16, 2020Updated 5 years ago
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012☆49Apr 6, 2022Updated 4 years ago
- End-to-end encrypted cloud storage - Proton Drive • AdSpecial offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
- ☆13Feb 7, 2023Updated 3 years ago
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch☆120Aug 4, 2021Updated 4 years ago
- My implementation of the gMLP model from the paper "Pay Attention to MLPs".☆25May 25, 2021Updated 4 years ago
- Long Range Arena for Benchmarking Efficient Transformers☆788Dec 16, 2023Updated 2 years ago
- [ICLR 2023] Codebase for Copy-Generator model, including an implementation of kNN-LM☆190Jan 27, 2025Updated last year
- Momentum Decoding: Open-ended Text Generation as Graph Exploration☆19Jan 27, 2023Updated 3 years ago
- [NeurIPS'22 Spotlight] A Contrastive Framework for Neural Text Generation☆476Mar 7, 2024Updated 2 years ago
- ☆14Apr 12, 2023Updated 2 years ago
- Invertible Generative Flows☆85Apr 1, 2021Updated 5 years ago
- NordVPN Special Discount Offer • AdSave on top-rated NordVPN 1 or 2-year plans with secure browsing, privacy protection, and support for for all major platforms.
- FaVIQ: Fact Verification from Information-seeking Questions☆43Nov 23, 2022Updated 3 years ago
- Code Repository for "Please Mind the Root: Decoding Arborescences for Dependency Parsing" and "On Finding the K-best Non-projective Depen…☆20Dec 12, 2022Updated 3 years ago
- Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets☆130Nov 12, 2022Updated 3 years ago
- Pytorch implementation of "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions", ICASSP, 2018.☆19Jan 21, 2021Updated 5 years ago
- [EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion our EMNLP 2023 paper - Accelerating Toeplitz…☆14Oct 17, 2023Updated 2 years ago
- Implementation of QKVAE☆11Feb 24, 2023Updated 3 years ago
- Open-sourcing code associated with the AAAI-25 paper "On the Expressiveness and Length Generalization of Selective State-Space Models on …☆16Sep 18, 2025Updated 6 months ago