DocTron-hub / DocTron-Formula
☆61Updated 3 months ago
Alternatives and similar repositories for DocTron-Formula
Users that are interested in DocTron-Formula are comparing it to the libraries listed below
- ☆135Updated 9 months ago
- Unveiling Super Experts in Mixture-of-Experts Large Language Models☆30Updated 2 months ago
- An awesome GPU task scheduler. A lightweight, easy-to-use task scheduling tool for GPU clusters. If you find it useful, please give it a star.☆193Updated 3 years ago
- The test of different distributed-training methods on High-Flyer AIHPC☆26Updated 3 years ago
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory [COLM2025]☆196Updated 4 months ago
- Parameter-Efficient Fine-Tuning for Foundation Models☆100Updated 8 months ago
- Easily download huggingface large models from within China via Aliyun Drive and Colab☆36Updated last year
- mllm-npu: training multimodal large language models on Ascend NPUs☆94Updated last year
- Build a daily academic subscription pipeline! Get daily Arxiv papers and corresponding chatGPT summaries with pre-defined keywords. It is…☆46Updated 2 years ago
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training …☆59Updated 6 months ago
- Download the LaTeX source code of multiple arXiv papers with one click☆112Updated 2 years ago
- ☆28Updated last year
- The official repository of the dots.vlm1 instruct models proposed by rednote-hilab.☆265Updated 2 months ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines☆127Updated last year
- An introduction to CUDA: a simple and easy CUDA project☆22Updated 3 years ago
- 青稞Talk☆168Updated this week
- DELT: Data Efficacy for Language Model Training☆42Updated 3 months ago
- A minimal, easy-to-read PyTorch reimplementation of the Qwen3 and Qwen2.5 VL with a fancy CLI☆192Updated this week
- The official repository for the Scientific Paper Idea Proposer (SciPIP)☆66Updated 9 months ago
- Open-Pandora: On-the-fly Control Video Generation☆35Updated last year
- Customize your arXiv recommendation every day.☆135Updated 2 months ago
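- A multimodal large model implemented from scratch and named Reyes (睿视; R stands for 睿, "eyes" for 眼). Reyes has 8B parameters, uses InternViT-300M-448px-V2_5 as its vision encoder and Qwen2.5-7B-Instruct on the language-model side, and also uses a two-layer MLP projection layer to connect…☆27Updated 9 months ago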
- Tutorial for Ray☆36Updated last year
- A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention☆233Updated 3 months ago
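- WanJuan-CC is a high-quality dataset built on CommonCrawl and obtained through data extraction, rule-based cleaning, deduplication, safety filtering, quality cleaning, and other steps.☆15Updated last year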
- ☆18Updated 3 years ago
- Official code implementation of Slow Perception:Let's Perceive Geometric Figures Step-by-step☆153Updated 4 months ago
- Scaling Preference Data Curation via Human-AI Synergy☆130Updated 4 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆145Updated 7 months ago
- This project provides a Tensor Parallel (TP) deployment tutorial for huggingface LLM models on the 910B, and can also serve as a minimal piece of code for learning TP.☆30Updated last year