A curated list of papers, tools, and resources on Multi-Token Prediction (MTP) and related techniques in Large Language Models (LLMs), Speech-Language Models (SLMs), and more.
☆149 · Jun 13, 2026 · Updated this week
Alternatives and similar repositories for Awesome-Multi-Token-Prediction
Users that are interested in Awesome-Multi-Token-Prediction are comparing it to the libraries listed below.
- [MM 2025] Towards Modality Generalization: A Benchmark and Prospective Analysis ☆28 · May 22, 2025 · Updated last year
- Implementation of our paper, "MM-Forecast: A Multimodal Approach to Temporal Event Forecasting with Large Language Models". ☆18 · Apr 16, 2025 · Updated last year
- The repository of paper Personalized Multimodal Response Generation with Large Language Models ☆18 · Jun 28, 2024 · Updated last year
- A curated list of Vision (video/image) to Audio Generation ☆105 · Feb 10, 2026 · Updated 4 months ago
- FlexiTokens ☆23 · Dec 27, 2025 · Updated 5 months ago
- The official github repo for MixEval-X, the first any-to-any, real-world benchmark. ☆17 · Feb 15, 2025 · Updated last year
- Comparison of existing spell checking tools ☆11 · Mar 28, 2023 · Updated 3 years ago
- The implementation of paper "EliMRec: Eliminating single-modal bias in multimedia recommendation", MM'22. ☆22 · Dec 7, 2023 · Updated 2 years ago
- Diffusion Models for Generative Outfit Recommendation ☆40 · Sep 11, 2024 · Updated last year
- the code for paper: A Symmetric Dual Encoding Dense Retrieval Framework for Knowledge-Intensive Visual Question Answering ☆14 · Aug 22, 2023 · Updated 2 years ago
- Localization of Knowledge in Text-to-Image Models ☆12 · Oct 8, 2024 · Updated last year
- [NeurIPS 2024] The implementation of paper "On Softmax Direct Preference Optimization for Recommendation" ☆101 · Nov 29, 2024 · Updated last year
- [AAAI26] Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilitie… ☆11 · Feb 7, 2026 · Updated 4 months ago
- GPT Demo with hybrid distributed training ☆10 · Dec 1, 2022 · Updated 3 years ago
- BPE modification that removes intermediate tokens during tokenizer training. ☆27 · Nov 25, 2024 · Updated last year
- ☆18 · Jun 8, 2026 · Updated last week
- Harmonic-NAS: Hardware-Aware Multimodal Neural Architecture Search on Resource-constrained Devices (ACML 2023) ☆16 · May 7, 2024 · Updated 2 years ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models. ☆10 · May 16, 2024 · Updated 2 years ago
- Data-efficient Fine-tuning for LLM-based Recommendation (SIGIR'24) ☆40 · Feb 21, 2025 · Updated last year
- ☆10 · Sep 13, 2022 · Updated 3 years ago
- Implementation for "DeltaPhi: Learning Physical Trajectory Residual for PDE Solving" ☆13 · Jun 17, 2024 · Updated 2 years ago
- Official implementation of our paper "Exploring the Potential of Diffusion Large Language Models in Code Generation". ☆23 · Oct 29, 2025 · Updated 7 months ago
- [ICML 2025] Official PyTorch implementation of "NegMerge: Sign-Consensual Weight Merging for Machine Unlearning" ☆16 · Nov 25, 2025 · Updated 6 months ago
- A Tensorflow.keras implementation of Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorizatio… ☆10 · Dec 18, 2019 · Updated 6 years ago
- [NeurIPS 2025] The implementation of paper "On Reasoning Strength Planning in Large Reasoning Models" ☆32 · Jul 6, 2025 · Updated 11 months ago
- [TCSVT23] Official code for "SPT: Spatial Pyramid Transformer for Image Captioning". ☆10 · Aug 14, 2024 · Updated last year
- Mixture of Experts from scratch ☆14 · Apr 12, 2024 · Updated 2 years ago
- ☆16 · Jul 2, 2022 · Updated 3 years ago
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter". ☆12 · Dec 27, 2023 · Updated 2 years ago
- Learning Accurate Decision Trees with Bandit Feedback via Quantized Gradient Descent ☆16 · Sep 8, 2022 · Updated 3 years ago
- [CVPR 2022] DiSparse: Disentangled Sparsification for Multitask Model Compression ☆14 · Sep 6, 2022 · Updated 3 years ago
- ☆23 · Sep 28, 2023 · Updated 2 years ago
- Code for AAAI 2021 long paper Learning from Crowds by Modeling Common Confusions. ☆11 · Feb 6, 2021 · Updated 5 years ago
- Code needed to reproduce results from my ICLR 2019 paper on fixed-point quantization of the backprop algorithm. ☆10 · Jan 24, 2019 · Updated 7 years ago
- PyTorch code for full quantization of DNN using BCGD ☆14 · Jul 24, 2019 · Updated 6 years ago
- Code and dataset release for "PACS: A Dataset for Physical Audiovisual CommonSense Reasoning" (ECCV 2022) ☆18 · Dec 20, 2022 · Updated 3 years ago
- [CVPR 2024] Official repository of ST_GT ☆10 · Sep 15, 2024 · Updated last year
- [NeurIPS 2024 Spotlight] CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning. ☆14 · Dec 12, 2024 · Updated last year
- [ICLR 2026] The implementation of paper "AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint" ☆59 · Nov 20, 2025 · Updated 6 months ago