☆64May 23, 2026Updated last week
Alternatives and similar repositories for llm_trainer
Users that are interested in llm_trainer are comparing it to the libraries listed below.
- Implement llm model in pytorch, support MoE and RoPE☆68May 18, 2026Updated last week
- Building large models from scratch: a complete walkthrough from pretraining to RLHF☆2,658May 20, 2026Updated last week
- EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing☆25May 29, 2025Updated last year
- naïve blockchain in Rust☆10Nov 13, 2020Updated 5 years ago
- Zeta implementation of a reusable and plug in and play feedforward from the paper "Exponentially Faster Language Modeling"☆16Nov 11, 2024Updated last year
- Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library☆52Aug 20, 2025Updated 9 months ago
- A Sample Code Project for ASP.NET 5 with Dapr☆13Apr 18, 2021Updated 5 years ago
- ☆28Dec 11, 2025Updated 5 months ago
- Reproduced the DFT method without using Verl. https://arxiv.org/abs/2508.05629☆23Oct 14, 2025Updated 7 months ago
- This is the official repo for the paper "AMO-Bench: Large Language Models Still Struggle in High School Math Competitions".☆126Feb 6, 2026Updated 3 months ago
- Experiments with reasoning models, training techniques, papers☆30Updated this week
- java implementation of Bert Tokenizer, support output onnx tensor for onnx model inference☆13Sep 4, 2023Updated 2 years ago
- Implementation of the network model from "Multiway Attention Networks for Modeling Sentence Pairs"; can be used for question answering and sentence-level logical inference☆11Apr 13, 2020Updated 6 years ago
- ☆15Apr 23, 2026Updated last month
- The implementation of Text Classification with Negative Supervision (ACL, 2020)☆10Oct 8, 2020Updated 5 years ago
- ☆10Jan 12, 2024Updated 2 years ago
- ☆13Sep 25, 2021Updated 4 years ago
- MetaSearch: an implementation of LLM deep research (deepsearch) functionality☆33Aug 21, 2025Updated 9 months ago
- Experimental syslog template mining module☆11Aug 29, 2016Updated 9 years ago
- Multiagent optimization system (MAOS) for solving the Traveling Salesman Problem (TSP).☆12Aug 7, 2019Updated 6 years ago
- Methods and experiments for assumed density SDE approximations☆12Jan 26, 2022Updated 4 years ago
- Taylor moment expansion in Python (JaX and SymPy) and Matlab☆11Nov 26, 2024Updated last year
- [ICLR 2022] Denoising Likelihood Score Matching for Conditional Score-based Data Generation