DocTron-hub / DocTron-Formula
☆61Updated 4 months ago
Alternatives and similar repositories for DocTron-Formula
Users who are interested in DocTron-Formula are comparing it to the repositories listed below
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory [COLM2025]☆198Updated 5 months ago
- An awesome GPU task scheduler. A lightweight, easy-to-use task scheduling tool for GPU clusters. If you find it useful, give it a star.☆194Updated 3 years ago
- ☆136Updated 10 months ago
- [ICML 2025 Oral] Mixture of Lookup Experts☆56Updated 2 weeks ago
- A minimal PyTorch re-implementation of Qwen3 VL with a fancy CLI☆287Updated 2 weeks ago
- 青稞Talk☆175Updated last week
- The official repository of the dots.vlm1 instruct models proposed by rednote-hilab.☆275Updated 2 months ago
- ☆28Updated last year
- Triton documentation in Simplified Chinese☆95Updated this week
- ☆187Updated 10 months ago
- ☆108Updated last month
- The test of different distributed-training methods on High-Flyer AIHPC☆26Updated 3 years ago
- The official repository for the Scientific Paper Idea Proposer (SciPIP)☆66Updated 9 months ago
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training …☆60Updated 7 months ago
- mllm-npu: training multimodal large language models on Ascend NPUs☆94Updated last year
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆137Updated last year
- Parameter-Efficient Fine-Tuning for Foundation Models☆103Updated 8 months ago
- Scaling Preference Data Curation via Human-AI Synergy☆132Updated 5 months ago
- Efficient Mixture of Experts for LLM Paper List☆149Updated 2 months ago
- qwen-nsa☆85Updated 2 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆146Updated 8 months ago
- Delta-CoMe achieves near-lossless 1-bit compression; accepted at NeurIPS 2024☆59Updated last year
- Unveiling Super Experts in Mixture-of-Experts Large Language Models☆32Updated 2 months ago
- Build a daily academic subscription pipeline! Get daily Arxiv papers and corresponding chatGPT summaries with pre-defined keywords. It is…☆47Updated 2 years ago
- Datasets, Transforms and Models specific to Computer Vision☆90Updated 2 years ago
- ☆18Updated 3 years ago
- PaperHelper: Knowledge-Based LLM QA Paper Reading Assistant with Reliable References☆20Updated last year
- ☆61Updated last year
- ☆112Updated 6 months ago
- Token level visualization tools for large language models☆90Updated 11 months ago