qibin0506/llm_trainer

qibin0506 / llm_trainer

☆53

Alternatives and similar repositories for llm_trainer

Users who are interested in llm_trainer are comparing it to the repositories listed below.

qibin0506 / llm_model
LLM implementation in PyTorch with support for MoE and RoPE
☆41 · Jan 29, 2026 · Updated last month
wiomax / MAOS-TSP
Multiagent optimization system (MAOS) for solving the Traveling Salesman Problem (TSP).
☆12 · Aug 7, 2019 · Updated 6 years ago
nmboffi / sbtm
Repository for score-based transport modeling.
☆11 · Jul 22, 2023 · Updated 2 years ago
21335732529sky / negative_supervision
The implementation of Text Classification with Negative Supervision (ACL, 2020)
☆10 · Oct 8, 2020 · Updated 5 years ago
ccj5351 / kitti-devkit
kitti-devkit for generating error maps and KITTI-color-space disparity maps, and for pfm2uint16png and uint16png2pfm conversion
☆12 · Feb 20, 2021 · Updated 5 years ago
Macintoshxz / books
☆12 · Sep 25, 2021 · Updated 4 years ago
muggle-stack / sensevoice_cpp
☆25 · Jun 26, 2025 · Updated 8 months ago
AaltoML / scalable-inference-in-sdes
Methods and experiments for assumed density SDE approximations
☆12 · Jan 26, 2022 · Updated 4 years ago
chen-hao-chao / dlsm
[ICLR 2022] Denoising Likelihood Score Matching for Conditional Score-based Data Generation
☆11 · Jan 2, 2025 · Updated last year
Waffle-Liu / STRODE
STRODE: Stochastic Boundary Ordinary Differential Equation
☆13 · Jul 20, 2021 · Updated 4 years ago
zjhellofss / kuiperbook
☆15 · Jun 22, 2025 · Updated 8 months ago
metaimagine / ImLeile
👂 Typing is slow, talk to me. The project name means 'I am tired' in Chinese (我累了). This is an AI efficiency assistant, complete your d…
☆16 · Jun 8, 2024 · Updated last year
refresh-bio / ORCOM
Overlapping Reads COmpression with Minimizers
☆16 · May 19, 2022 · Updated 3 years ago
XMUDeepLIT / Translatotron-V
Code for "Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation" (Findings of ACL 2024)
☆16 · Jul 4, 2024 · Updated last year
hedixia / HeavyBallNODE
☆13 · Oct 24, 2021 · Updated 4 years ago
mkantwala / DeepSeek-R1-TrainingSuite
Advanced implementation of DeepSeek-R1 featuring Group Relative Policy Optimization (GRPO) for mathematical reasoning AI. Integrates safe…
☆13 · Jan 29, 2025 · Updated last year
neavo / KeywordGachaModel
☆17 · Jan 31, 2025 · Updated last year
mbilos / tsdiff
☆16 · May 12, 2023 · Updated 2 years ago
fluency03 / iplom-java
IPLoM (Iterative Partitioning Log Mining) - Java
☆15 · Mar 13, 2016 · Updated 9 years ago
Wang-xjtu / SPNet
[T-PAMI 2025] Scale Propagation Network for Generalizable Depth Completion
☆25 · Apr 1, 2025 · Updated 11 months ago
MLForNerds / YOLO-OBJECT-DETECTION-TUTORIALS
☆14 · Mar 3, 2025 · Updated last year
RomainLITUD / UQnet-arxiv
Source code for UQnet
☆16 · May 23, 2024 · Updated last year
Chongjie-Si / AdaMuon
The official repository for AdaMuon
☆35 · Aug 27, 2025 · Updated 6 months ago
tdmeeste / TimeAwareRNN
Code used for the AAAI 2020 paper "System Identification with Time-Aware Neural Sequence Models"
☆16 · Nov 22, 2019 · Updated 6 years ago
Archaic-Atom / JackFramework
A lightweight, production-friendly orchestration layer on top of PyTorch. JackFramework standardizes model/data wiring, distributed execu…
☆17 · Oct 20, 2025 · Updated 4 months ago
T6Yang / ViLReF
ViLReF: An Expert Knowledge Enabled Vision-Language Retinal Foundation Model
☆22 · Oct 16, 2024 · Updated last year
neuromorphic-paris / loris
Python 3 library to handle files from neuromorphic cameras
☆17 · Jan 26, 2025 · Updated last year
lansinuote / Simple_TRL
☆19 · Aug 9, 2024 · Updated last year
razvanc92 / ST-WA
☆20 · Jan 19, 2022 · Updated 4 years ago
Wenchao-Du / GAENet
This is the code for the work accepted by ICRA2022.
☆19 · Jun 20, 2022 · Updated 3 years ago
2prime / LM-ResNet
Code for "Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations"
☆15 · Jun 4, 2019 · Updated 6 years ago
Annyfee / agent-craft
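AI Agent teaching repository | Systematic full-stack hands-on code for LangChain, RAG, LangGraph, and MCP | In-depth blog walkthroughs (10,000+ words) | Open-source runnable examples | Build agents from scratch
☆94 · Feb 7, 2026 · Updated last month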
alexzhou907 / ls4
☆22 · Jul 24, 2023 · Updated 2 years ago
clear-nus / NCDSSM
PyTorch implementation of the NCDSSM models presented in the ICML '23 paper "Neural Continuous-Discrete State Space Models for Irregularl…
☆25 · Jul 9, 2023 · Updated 2 years ago
dae-sun / awesome-human-pose-estimation
Human Mesh Recovery / Human Pose Estimation
☆24 · Aug 29, 2022 · Updated 3 years ago
llmsystem / llmsys_code_examples
☆30 · Feb 12, 2026 · Updated 3 weeks ago
yantijin / dynamic-systems-DL
Collection of resources that combine dynamical systems and control with deep learning.
☆28 · May 18, 2021 · Updated 4 years ago
aiha-lab / Attention-Head-Pruning
Layer-wise Pruning of Transformer Heads for Efficient Language Modeling
☆22 · Feb 22, 2022 · Updated 4 years ago
LiaoMengqi / LLM4Game24
Long CoT Fine-Tuning and Reinforcement Learning for LLMs in the Context of the 24-Point Game: A Toy Project
☆25 · Feb 22, 2025 · Updated last year