GeeeekExplorer / 3d-parallel-demo
Implements DP/TP/PP using the torch.distributed package
☆10 Updated last year
Alternatives and similar repositories for 3d-parallel-demo:
Users who are interested in 3d-parallel-demo are comparing it to the libraries listed below:
- A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation ☆45 Updated last week
- ☆33 Updated last year
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆191 Updated 11 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆171 Updated this week
- Reproducing R1 for Code with Reliable Rewards ☆140 Updated 3 weeks ago
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models ☆80 Updated last month
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆67 Updated this week
- Chain of Thought (CoT) is so hot! So long! We need a short reasoning process! ☆43 Updated 2 weeks ago
- A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enab… ☆65 Updated last month
- 🔥 How to efficiently and effectively compress the CoTs or directly generate concise CoTs during inference while maintaining the reasonin… ☆23 Updated this week
- A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond ☆42 Updated this week
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ☆37 Updated 3 months ago
- Paper list for Efficient Reasoning. ☆331 Updated this week
- [ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length ☆62 Updated 2 weeks ago
- ☆171 Updated last month
- A Comprehensive Survey on Long Context Language Modeling ☆113 Updated this week
- The official repository of "Whoever Started the Interference Should End It: Guiding Data-Free Model Merging via Task Vectors" ☆14 Updated 2 weeks ago
- Blog posts, reading reports, and code examples for AGI/LLM-related knowledge. ☆36 Updated last month
- SOTA RL fine-tuning solution for advanced math reasoning of LLM ☆92 Updated this week
- Due to the huge vocabulary size (151,936) of Qwen models, the Embedding and LM Head weights are excessively heavy. Therefore, this projec… ☆17 Updated 7 months ago
- A research repo for experiments about Reinforcement Finetuning ☆37 Updated last week
- ☆32 Updated 5 months ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆103 Updated 2 weeks ago
- ☆185 Updated 5 months ago
- ☆108 Updated this week
- ☆170 Updated 8 months ago
- Implementation code for ACL 2024: Advancing Parameter Efficiency in Fine-tuning via Representation Editing ☆13 Updated 11 months ago
- Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding" ☆176 Updated last month
- Curation of resources for LLM research, screened by @tongyx361 to ensure high quality and accompanied by elaborately written concise de… ☆49 Updated 8 months ago
- ☆49 Updated last month