DocTron-hub / DocTron-Formula
☆61 · Updated 5 months ago
Alternatives and similar repositories for DocTron-Formula
Users interested in DocTron-Formula are comparing it to the libraries listed below.
- An awesome GPU task scheduler. A lightweight, easy-to-use GPU cluster task scheduling tool; if you find it useful, give it a star. ☆196 · Updated 3 years ago
- The official repository for the Scientific Paper Idea Proposer (SciPIP) ☆67 · Updated 11 months ago
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory [COLM 2025] ☆199 · Updated 6 months ago
- (ICLR 2026) Unveiling Super Experts in Mixture-of-Experts Large Language Models ☆35 · Updated 4 months ago
- Easily download large Hugging Face models from within mainland China via Aliyun Drive or Colab ☆38 · Updated last year
- Tests of different distributed-training methods on High-Flyer AIHPC ☆26 · Updated 3 years ago
- Parameter-Efficient Fine-Tuning for Foundation Models ☆109 · Updated 10 months ago
- Build a daily academic subscription pipeline! Get daily arXiv papers and corresponding ChatGPT summaries with pre-defined keywords. It is… ☆46 · Updated 2 years ago
- ☆135 · Updated 11 months ago
- ☆29 · Updated last year
- This project provides a Tensor Parallel (TP) deployment tutorial for Hugging Face LLMs on the Ascend 910B, and also serves as a minimal TP learning codebase. ☆30 · Updated 3 weeks ago
- Download the LaTeX source of multiple arXiv papers with one click ☆113 · Updated 2 years ago
- ☆74 · Updated 8 months ago
- [ACL 2025 Main] Multi-Agent System for Science of Science ☆121 · Updated 6 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ☆147 · Updated 9 months ago
- Token-level visualization tools for large language models ☆91 · Updated last year
- This is the official implementation for "AUTOPR: LET'S AUTOMATE YOUR ACADEMIC PROMOTION!". ☆94 · Updated 3 months ago
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training … ☆66 · Updated 8 months ago
- A personalized, customizable arXiv template for effectively following relevant content, authors, and academic conferences in a specific field. ☆332 · Updated this week
- Scaling Preference Data Curation via Human-AI Synergy ☆137 · Updated 6 months ago
- ☆187 · Updated 11 months ago
- Tools to assist with academic papers ☆46 · Updated last month
- Delta-CoMe achieves near-lossless 1-bit compression; accepted at NeurIPS 2024 ☆58 · Updated last year
- 青稞Talk ☆190 · Updated last week
- ☆173 · Updated last week
- DeepSpeed tutorials, annotated examples, and study notes (efficient large-model training) ☆186 · Updated 2 years ago
- The official GitHub page for the survey paper "A Survey of RWKV". ☆30 · Updated last year
- Self-reproduction code for the paper "Reducing Transformer Key-Value Cache Size with Cross-Layer Attention" (MIT CSAIL) ☆18 · Updated last year
- 2025.01: A multimodal large model implemented from scratch, named Reyes (睿视; "R" from 睿 "wise", "eyes" for 眼). Reyes has 8B parameters, uses InternViT-300M-448px-V2_5 as its vision encoder and Qwen2.5-7B-Instruct on the language side; Reyes also, through a two-… ☆29 · Updated this week
- mllm-npu: training multimodal large language models on Ascend NPUs ☆95 · Updated last year