hhnqqq / MyTransformers
This repository provides a comprehensive library for parallel training and LoRA algorithm implementations, supporting multiple parallel strategies and a rich collection of LoRA variants. It serves as a flexible and efficient model fine-tuning toolkit for researchers and developers. Please contact hehn@mail.ustc.edu.cn for detailed information.
☆53 Updated 3 weeks ago
Alternatives and similar repositories for MyTransformers
Users who are interested in MyTransformers are comparing it to the libraries listed below.
- A Collection of Papers on Diffusion Language Models ☆152 Updated 4 months ago
- ☆137 Updated last week
- A tiny paper rating web ☆38 Updated 10 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models. ☆347 Updated 3 weeks ago
- Paper List of Inference/Test Time Scaling/Computing ☆342 Updated 5 months ago
- [TMLR 2025] Efficient Reasoning Models: A Survey ☆296 Updated 3 weeks ago
- paper list, tutorial, and nano code snippet for Diffusion Large Language Models. ☆152 Updated last week
- Long-RL: Scaling RL to Long Sequences (NeurIPS 2025) ☆687 Updated 4 months ago
- The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning" ☆140 Updated 3 weeks ago
- One-shot Entropy Minimization ☆188 Updated 7 months ago
- Interleaving Reasoning: Next-Generation Reasoning Systems for AGI ☆240 Updated 3 months ago
- TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models ☆398 Updated last month
- A paper list of Awesome Latent Space. ☆305 Updated last week
- ScalingOpt - Optimization Community ☆76 Updated last week
- Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference ☆238 Updated 2 weeks ago
- ✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints ☆80 Updated 6 months ago
- ☆308 Updated last month
- Code release for VTW (AAAI 2025 Oral) ☆64 Updated 2 months ago
- ☆168 Updated 2 months ago
- (CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction ☆141 Updated 10 months ago
- Doodling our way to AGI ✏️ 🖼️ 🧠 ☆120 Updated 8 months ago
- A python script for downloading huggingface datasets and models. ☆20 Updated 9 months ago
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache… ☆197 Updated 2 months ago
- Official repository for VisionZip (CVPR 2025) ☆403 Updated 6 months ago
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models ☆152 Updated 6 months ago
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey ☆472 Updated last year
- [EMNLP 2025 main 🔥] Code for "Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More" ☆100 Updated 3 months ago
- Official Code for "Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search" ☆394 Updated 4 months ago
- ☆162 Updated last week
- [ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" and "Sp… ☆231 Updated last month