Simple and efficient DeepSeek V3 SFT using pipeline parallelism and expert parallelism, with both FP8 and BF16 training
☆115 · Jul 27, 2025 · Updated 7 months ago
Alternatives and similar repositories for pipelining-sft
Users that are interested in pipelining-sft are comparing it to the libraries listed below
- coded with and corrected by Google Anti-Gravity ☆13 · Nov 23, 2025 · Updated 3 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆373 · Feb 26, 2026 · Updated last week
- Python package for compressing floating-point PyTorch tensors ☆13 · Jul 22, 2024 · Updated last year
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆41 · Apr 4, 2025 · Updated 11 months ago
- ☆43 · Jan 27, 2026 · Updated last month
- A FREE comprehensive step-by-step 8-bit ATmega328P C and Assembler tutorial covering Embedded Software Development to Reverse Engineering… ☆11 · Nov 26, 2025 · Updated 3 months ago
- CIKM 2022: Evaluating Interpolation and Extrapolation Performance of Neural Retrieval Models ☆11 · Aug 4, 2022 · Updated 3 years ago
- Code for "Consistent Estimators for Learning to Defer to an Expert" (ICML 2020) ☆16 · Jan 28, 2023 · Updated 3 years ago
- ⚔️ [ICLR 2026] Official code of "Search Arena: Analyzing Search-Augmented LLMs". ☆52 · Feb 23, 2026 · Updated 2 weeks ago
- Software relating to relational empirical risk minimization ☆17 · Jun 12, 2021 · Updated 4 years ago
- j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models. ☆102 · Jul 19, 2025 · Updated 7 months ago
- ☆14 · Jun 25, 2025 · Updated 8 months ago
- A framework for majority vote classifiers allowing for computation of PAC Bayesian risk bounds. ☆14 · Feb 9, 2023 · Updated 3 years ago
- creditmodel, model, scorecard, vintage, automatic modeling ☆11 · Aug 10, 2024 · Updated last year
- Official implementation for "Law of the Weakest Link: Cross capabilities of Large Language Models" ☆43 · Oct 1, 2024 · Updated last year
- This website contains the python code accompanying the book "Mathematical Foundations of Deep Learning Models and Algorithms" by Konstant… ☆48 · Nov 24, 2025 · Updated 3 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆28 · Mar 1, 2025 · Updated last year
- Train speculative decoding models effortlessly and port them smoothly to SGLang serving. ☆716 · Feb 28, 2026 · Updated last week
- Source code for EMNLP findings paper "Open-Vocabulary Argument Role Prediction for Event Extraction" ☆19 · Nov 5, 2022 · Updated 3 years ago
- ☆18 · Feb 25, 2026 · Updated last week
- Fluid Language Model Benchmarking ☆26 · Sep 16, 2025 · Updated 5 months ago
- Example of how to use R in Jupyter notebooks and make compatible with Binder ☆17 · Feb 25, 2019 · Updated 7 years ago
- noise reduction ☆17 · Jul 3, 2024 · Updated last year
- Manages vllm-nccl dependency ☆17 · Jun 3, 2024 · Updated last year
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions. ☆20 · Jun 3, 2024 · Updated last year
- Simple and efficient pytorch-native transformer training and inference (batched) ☆79 · Apr 2, 2024 · Updated last year
- Codebase for running (conditional) probing experiments ☆22 · Nov 13, 2022 · Updated 3 years ago
- Open Source Speech/Text Data on AI ☆19 · Sep 13, 2022 · Updated 3 years ago
- Website with current metrics on the fastest AI models. ☆43 · Nov 13, 2024 · Updated last year
- Inference-time scaling for LLMs-as-a-judge. ☆330 · Nov 5, 2025 · Updated 4 months ago
- A FREE course that takes you step-by-step through building a custom Automation Ansible Framework from scratch. ☆22 · Nov 26, 2025 · Updated 3 months ago
- Prompt based agentic developer primitives ☆36 · Jun 17, 2025 · Updated 8 months ago
- Featurized Density Ratio Estimation ☆20 · Jul 11, 2021 · Updated 4 years ago
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25 ☆89 · Jun 16, 2025 · Updated 8 months ago
- ☆25 · May 28, 2025 · Updated 9 months ago
- ☆23 · Oct 17, 2024 · Updated last year
- ☆93 · Jul 5, 2024 · Updated last year
- One-stop shop for running and fine-tuning transformer-based language models for retrieval ☆63 · Mar 2, 2026 · Updated last week
- Train your own SOTA deductive reasoning model ☆107 · Mar 6, 2025 · Updated last year