A curated list of papers, tools, and resources on Multi-Token Prediction (MTP) and related techniques in Large Language Models (LLMs), Speech-Language Models (SLMs), and more.
☆125 · May 25, 2026 · Updated this week
Alternatives and similar repositories for Awesome-Multi-Token-Prediction
Users that are interested in Awesome-Multi-Token-Prediction are comparing it to the libraries listed below.
- [MM 2025] Towards Modality Generalization: A Benchmark and Prospective Analysis ☆28 · May 22, 2025 · Updated last year
- [NeurIPS 2025] L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models ☆28 · May 8, 2026 · Updated 3 weeks ago
- [KDD 2025] Fine-tuning Multimodal Large Language Models for Product Bundling ☆15 · Sep 20, 2025 · Updated 8 months ago
- Improving large language models with concept-aware fine-tuning (CAFT) ☆29 · Jan 31, 2026 · Updated 3 months ago
- Repository for the paper "Personalized Multimodal Response Generation with Large Language Models" ☆18 · Jun 28, 2024 · Updated last year
- A curated list of Vision (video/image) to Audio Generation ☆105 · Feb 10, 2026 · Updated 3 months ago
- Official code of "Invariant Collaborative Filtering to Popularity Distribution Shift" (2023 WWW) ☆21 · Jul 27, 2023 · Updated 2 years ago
- FlexiTokens ☆23 · Dec 27, 2025 · Updated 5 months ago
- ☆21 · Apr 3, 2026 · Updated last month
- Comparison of existing spell checking tools ☆11 · Mar 28, 2023 · Updated 3 years ago
- The implementation of paper "EliMRec: Eliminating single-modal bias in multimedia recommendation", MM'22. ☆22 · Dec 7, 2023 · Updated 2 years ago
- Diffusion Models for Generative Outfit Recommendation ☆39 · Sep 11, 2024 · Updated last year
- Code for the paper "A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering" ☆13 · Aug 22, 2023 · Updated 2 years ago
- [KDD'25] LLM2Rec: Large Language Models Are Powerful Embedding Models for Sequential Recommendation. ☆65 · Sep 6, 2025 · Updated 8 months ago
- Localization of Knowledge in Text-to-Image Models ☆12 · Oct 8, 2024 · Updated last year
- GPT Demo with hybrid distributed training ☆10 · Dec 1, 2022 · Updated 3 years ago
- BPE modification that removes intermediate tokens during tokenizer training. ☆27 · Nov 25, 2024 · Updated last year
- ☆18 · May 19, 2026 · Updated last week
- Harmonic-NAS: Hardware-Aware Multimodal Neural Architecture Search on Resource-constrained Devices (ACML 2023) ☆16 · May 7, 2024 · Updated 2 years ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models. ☆10 · May 16, 2024 · Updated 2 years ago
- ☆14 · Jun 24, 2024 · Updated last year
- ☆10 · Sep 13, 2022 · Updated 3 years ago
- Implementation for "DeltaPhi: Learning Physical Trajectory Residual for PDE Solving" ☆13 · Jun 17, 2024 · Updated last year
- Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Mo… ☆20 · Sep 1, 2023 · Updated 2 years ago
- Implementation of our paper, Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination. ☆20 · Dec 3, 2023 · Updated 2 years ago
- PyTorch implementation of our paper: Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training. ☆18 · Jul 11, 2022 · Updated 3 years ago
- Official implementation of our paper "Exploring the Potential of Diffusion Large Language Models in Code Generation". ☆22 · Oct 29, 2025 · Updated 7 months ago
- A Tensorflow.keras implementation of Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorizatio… ☆10 · Dec 18, 2019 · Updated 6 years ago
- ☆18 · May 25, 2023 · Updated 3 years ago
- Mixture of Experts from scratch ☆14 · Apr 12, 2024 · Updated 2 years ago
- Explore, Establish, Exploit: Red Teaming Language Models from Scratch ☆15 · Jun 21, 2023 · Updated 2 years ago
- Federated Conformal Prediction with Quantile-of-Quantiles (FedCP-QQ) ☆11 · May 6, 2026 · Updated 3 weeks ago
- ☆23 · Sep 28, 2023 · Updated 2 years ago
- Contrastive Continual Learning with Importance Sampling and Prototype-Instance Relation Distillation ☆12 · Jul 22, 2024 · Updated last year
- Code for the AAAI 2021 long paper "Learning from Crowds by Modeling Common Confusions". ☆11 · Feb 6, 2021 · Updated 5 years ago
- Code needed to reproduce results from my ICLR 2019 paper on fixed-point quantization of the backprop algorithm. ☆10 · Jan 24, 2019 · Updated 7 years ago
- The implementation of paper "Self-supervised learning for multimedia recommendation", TMM'22. ☆10 · Jul 4, 2022 · Updated 3 years ago
- Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022) ☆18 · Dec 20, 2022 · Updated 3 years ago
- [CVPR 2024] Official repository of ST_GT ☆10 · Sep 15, 2024 · Updated last year