Code for paper "Patch-Level Training for Large Language Models"
☆96Nov 10, 2025Updated 3 months ago
Alternatives and similar repositories for PatchTrain
Users that are interested in PatchTrain are comparing it to the libraries listed below
Sorting:
- A Toolkit for a series of Young projects.☆23Apr 30, 2021Updated 4 years ago
- Official Implementation for the ICLR2023 paper "Fuzzy Alignments in Directed Acyclic Graph for Non-autoregressive Machine Translation"☆14Mar 1, 2023Updated 2 years ago
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025).☆12Sep 22, 2025Updated 5 months ago
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024☆16Nov 19, 2024Updated last year
- DST is a Decoder-only simultaneous machine translation model, which can conduct policy decision and translation concurrently☆11Jun 6, 2024Updated last year
- Code for NeurIPS 2023 paper "Non-autoregressive Machine Translation with Probabilistic Context-free Grammar".☆12Jan 4, 2024Updated 2 years ago
- Code for EMNLP 2022 main conference paper "Low-resource Neural Machine Translation with Cross-modal Alignment".☆14Apr 25, 2023Updated 2 years ago
- Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"☆13Nov 3, 2022Updated 3 years ago
- Code for ACL 2022 main conference paper "Modeling Dual Read/Write Paths for Simultaneous Machine Translation"☆12Mar 31, 2022Updated 3 years ago
- The official repo of continuous speculative decoding☆31Mar 28, 2025Updated 11 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling☆42Dec 29, 2025Updated 2 months ago
- Code for NeurIPS 2022 Spotlight paper " Non-Monotonic Latent Alignments for CTC-Based Non-Autoregressive Machine Translation"☆20Nov 16, 2022Updated 3 years ago
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality☆19Mar 4, 2025Updated 11 months ago
- [NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding☆21Oct 10, 2024Updated last year
- Code for the paper "Function-Space Learning Rates"☆25Jun 3, 2025Updated 8 months ago
- SiLLM is a Simultaneous Machine Translation (SiMT) Framework. It utilizes a Large Language model as the translation model and employs a t…☆18Feb 22, 2024Updated 2 years ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- ☆21Sep 5, 2023Updated 2 years ago
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training☆23Aug 18, 2024Updated last year
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”