Code for paper "Patch-Level Training for Large Language Models"
☆97Nov 10, 2025Updated 5 months ago
Alternatives and similar repositories for PatchTrain
Users that are interested in PatchTrain are comparing it to the libraries listed below.
- Code for NeurIPS 2023 paper "Non-autoregressive Machine Translation with Probabilistic Context-free Grammar".☆12Jan 4, 2024Updated 2 years ago
- DST is a decoder-only simultaneous machine translation model that makes policy decisions and translates concurrently☆11Jun 6, 2024Updated last year
- Official Implementation for the ICLR2023 paper "Fuzzy Alignments in Directed Acyclic Graph for Non-autoregressive Machine Translation"☆14Mar 1, 2023Updated 3 years ago
- ☆21Sep 5, 2023Updated 2 years ago
- Code for ACL 2022 main conference paper "Modeling Dual Read/Write Paths for Simultaneous Machine Translation"☆12Mar 31, 2022Updated 4 years ago
- Code for EMNLP 2022 main conference paper "Low-resource Neural Machine Translation with Cross-modal Alignment".☆15Apr 25, 2023Updated 2 years ago
- Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"☆13Nov 3, 2022Updated 3 years ago
- Code for NeurIPS 2022 Spotlight paper "Non-Monotonic Latent Alignments for CTC-Based Non-Autoregressive Machine Translation"☆20Nov 16, 2022Updated 3 years ago
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".☆26Jul 2, 2024Updated last year
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”☆15Jan 3, 2025Updated last year
- SiLLM is a Simultaneous Machine Translation (SiMT) Framework. It utilizes a Large Language model as the translation model and employs a t…☆18Feb 22, 2024Updated 2 years ago
- The code of ACL2022 paper "Conditional Bilingual Mutual Information based Adaptive Training for Neural Machine Translation".☆14Aug 6, 2022Updated 3 years ago
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization☆110May 20, 2025Updated 10 months ago
- Source Code for ACL2019 paper <Bridging the Gap between Training and Inference for Neural Machine Translation>☆41Nov 10, 2020Updated 5 years ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling☆42Dec 29, 2025Updated 3 months ago
- [ACL 2024] An easily extensible framework for simultaneous, text-to-text neural machine translation (SimulMT) for LLMs.☆18Apr 21, 2025Updated 11 months ago
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning☆361Aug 7, 2024Updated last year
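- "BayLing" (百聆) is a LLaMA-based English/Chinese large language model with enhanced language alignment. It has strong English/Chinese capabilities and reaches 90% of ChatGPT's performance across multilingual and general-task evaluations. BayLing is an English/Chinese LLM equipped with advanced l…☆318Dec 3, 2024Updated last year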
- EMNLP 2022: ClidSum: A Benchmark Dataset for Cross-Lingual Dialogue Summarization☆36Jan 13, 2024Updated 2 years ago
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025).☆13Sep 22, 2025Updated 6 months ago
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study☆59Nov 24, 2024Updated last year
- An extension to the GaLore paper, to perform Natural Gradient Descent in low rank subspace☆18Oct 21, 2024Updated last year
- Minimal implementation of TokenFormer for inference and learning☆13Nov 6, 2024Updated last year
- Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".☆17Oct 25, 2023Updated 2 years ago
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)☆24Jun 6, 2024Updated last year
- Source code for the EMNLP 2020 long paper <Token-level Adaptive Training for Neural Machine Translation>.☆20Oct 28, 2022Updated 3 years ago
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024☆16Nov 19, 2024Updated last year
- ☆13Apr 1, 2026Updated last week
- [ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models☆48Feb 26, 2026Updated last month
- [ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear at…☆103Jun 14, 2024Updated last year
- Sparse Backpropagation for Mixture-of-Expert Training☆29Jul 2, 2024Updated last year
- The official repo of continuous speculative decoding☆32Mar 28, 2025Updated last year
- an easy-to-use knn-mt toolkit☆104Aug 19, 2023Updated 2 years ago
- ☆14Oct 17, 2024Updated last year
- Source code for the AAAI 2020 long paper <Modeling Fluency and Faithfulness for Diverse Neural Machine Translation>.☆19Mar 10, 2020Updated 6 years ago
- A reimplementation of KOSMOS-1 from "Language Is Not All You Need: Aligning Perception with Language Models"☆27Mar 3, 2023Updated 3 years ago
- Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation…☆32Jan 14, 2025Updated last year
- [EMNLP2022] Source code for Neural Machine Translation with Contrastive Translation Memories☆12Feb 15, 2023Updated 3 years ago
- Code for the paper "Function-Space Learning Rates"☆25Jun 3, 2025Updated 10 months ago