Fast LLM training codebase with dynamic strategy selection [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler]
☆40Jan 4, 2024Updated 2 years ago
Alternatives and similar repositories for FastLLM
Users who are interested in FastLLM are comparing it to the libraries listed below.
- Reproduction of the complete process of DeepSeek-R1 on small-scale models, including Pre-training, SFT, and RL.☆29Mar 11, 2025Updated 11 months ago
- MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos.☆30Jul 9, 2025Updated 7 months ago
- ☆11May 9, 2022Updated 3 years ago
- Built upon Megatron-Deepspeed and HuggingFace Trainer, EasyLLM has reorganized the code logic with a focus on usability. While enhancing …☆49Sep 18, 2024Updated last year
- Our 2nd-gen LMM☆34May 22, 2024Updated last year
- Agently Stage - Efficient Convenient Asynchronous & Multithreaded Programming☆13Apr 2, 2025Updated 10 months ago
- DeepTrace: A lightweight, scalable real-time diagnostic and analysis tool for distributed training tasks.☆18Nov 4, 2025Updated 3 months ago
- [ACL 2025] Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging☆39Jun 4, 2025Updated 8 months ago
- This tool (enhance_long) aims to enhance the long-context extrapolation capability of Llama 2 at the lowest cost, preferably without …☆45Nov 30, 2023Updated 2 years ago
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training …☆67May 7, 2025Updated 9 months ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers☆23Feb 9, 2025Updated last year
- ☆48Jan 20, 2026Updated last month
- [ICLR'25] ApolloMoE: Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts☆52Nov 20, 2024Updated last year
- Code implementation of Baichuan Dynamic NTK-ALiBi: inference on longer text without fine-tuning☆49Aug 27, 2023Updated 2 years ago
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture☆213Jan 6, 2025Updated last year
- ☆20Jan 6, 2023Updated 3 years ago
- Code for COLING22 paper, DPTDR: Deep Prompt Tuning for Dense Passage Retrieval☆26Aug 7, 2023Updated 2 years ago
- DELT: Data Efficacy for Language Model Training☆43Feb 12, 2026Updated 2 weeks ago
- The AI paper weekly report automation project in collaboration between Agent Universe and Kimi.☆23Aug 3, 2024Updated last year
- Modality Gap–Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models☆51Updated this week
- ☆29Sep 17, 2024Updated last year
- A beginner's introductory tutorial on model compression☆22Jul 7, 2024Updated last year
- Code and data for COLING 2022 paper titled "Structural Bias For Aspect Sentiment Triplet Extraction"☆26May 28, 2023Updated 2 years ago
- DEYOv1.5☆29Jul 22, 2024Updated last year
- ICLR 2025☆31May 21, 2025Updated 9 months ago
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation☆71Oct 17, 2025Updated 4 months ago
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types☆32Jul 16, 2025Updated 7 months ago
- Supports BM25+vector☆29May 26, 2025Updated 9 months ago
- Reproduction of LLaVA-v1.5 based on Llama-3-8b LLM backbone.☆65Oct 25, 2024Updated last year
- A family of highly capable yet efficient large multimodal models☆191Aug 23, 2024Updated last year
- [TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models☆149Oct 10, 2025Updated 4 months ago
- ☆72Apr 2, 2024Updated last year
- ☆34Feb 6, 2026Updated 3 weeks ago
- A collection of different PyTorch wrappers for training neural networks and reinforcement algorithms☆13Dec 15, 2022Updated 3 years ago
- Towards Systematic Measurement for Long Text Quality☆37Sep 5, 2024Updated last year
- Segmentation-Based Deep-Learning Approach for Surface-Defect Detection☆27Dec 2, 2020Updated 5 years ago
- This repository provides an implementation of "A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction B…☆86Jul 9, 2025Updated 7 months ago
- Parameter-Efficient Fine-Tuning for Foundation Models☆110Mar 31, 2025Updated 11 months ago
- ☆134Feb 17, 2025Updated last year