bigai-nlco / TokenSwift
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation
☆80 Updated last week
Alternatives and similar repositories for TokenSwift:
Users who are interested in TokenSwift are comparing it to the repositories listed below.
- ☆124 Updated 3 weeks ago
- ☆106 Updated last month
- [preprint] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆43 Updated 3 months ago
- ☆166 Updated last month
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale" ☆229 Updated last month
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆161 Updated last week
- Reformatted Alignment ☆115 Updated 6 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆74 Updated 2 weeks ago
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆39 Updated 4 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆97 Updated last month
- ☆94 Updated 3 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆95 Updated 2 months ago
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning ☆84 Updated last month
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" ☆131 Updated last month
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆64 Updated this week
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems ☆67 Updated last week
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs ☆100 Updated 3 months ago
- ☆73 Updated last year
- The demo, code and data of FollowRAG ☆70 Updated 3 months ago
- ☆101 Updated 11 months ago
- Code & Dataset for Paper: "Distill Visual Chart Reasoning Ability from LLMs to MLLMs" ☆51 Updated 5 months ago
- ☆29 Updated 4 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆162 Updated last week
- ☆262 Updated last week
- ☆102 Updated 3 months ago
- Official implementation of paper "Process Reward Model with Q-value Rankings" ☆51 Updated last month
- Repo of paper "Free Process Rewards without Process Labels" ☆138 Updated 2 weeks ago
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs ☆75 Updated 5 months ago
- MMR1: Advancing the Frontiers of Multimodal Reasoning ☆148 Updated last week
- MPO: Boosting LLM Agents with Meta Plan Optimization ☆40 Updated 3 weeks ago