qibin0506 / llm_trainer
☆50Jan 29, 2026Updated 2 weeks ago
Alternatives and similar repositories for llm_trainer
Users who are interested in llm_trainer are comparing it to the libraries listed below
- Implements an LLM in PyTorch, with support for MoE and RoPE☆39Jan 29, 2026Updated 2 weeks ago
- Building large models from scratch: a complete walkthrough from pre-training to RLHF☆2,392Updated this week
- Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library☆50Aug 20, 2025Updated 5 months ago
- ☆22Dec 11, 2025Updated 2 months ago
- A python implementation of PROCLUS: PROjected CLUStering algorithm.☆10Jan 12, 2015Updated 11 years ago
- Repository for score-based transport modeling.☆11Jul 22, 2023Updated 2 years ago
- ☆23Jun 26, 2025Updated 7 months ago
- Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling"☆16Nov 11, 2024Updated last year
- Multiagent optimization system (MAOS) for solving the Traveling Salesman Problem (TSP).☆12Aug 7, 2019Updated 6 years ago
- Java implementation of the BERT tokenizer; supports outputting ONNX tensors for ONNX model inference☆12Sep 4, 2023Updated 2 years ago
- Run AI models locally and offline on your machine☆11May 8, 2025Updated 9 months ago
- ☆10Jan 12, 2024Updated 2 years ago
- Implementation of the network model from "Multiway Attention Networks for Modeling Sentence Pairs"; can be used for question answering and sentence-level logical inference☆11Apr 13, 2020Updated 5 years ago
- ☆12Sep 25, 2021Updated 4 years ago
- MeloTTS demo on Axera☆10Nov 18, 2025Updated 2 months ago
- STRODE: Stochastic Boundary Ordinary Differential Equation☆13Jul 20, 2021Updated 4 years ago
- Tokenizing a Chinese corpus with Sentencepiece☆13Nov 30, 2023Updated 2 years ago
- Overlapping Reads COmpression with Minimizers☆16May 19, 2022Updated 3 years ago
- ☆13Oct 24, 2021Updated 4 years ago
- IPLoM (Iterative Partitioning Log Mining) - Java☆15Mar 13, 2016Updated 9 years ago
- ☆16Jan 31, 2025Updated last year
- The official repository for AdaMuon☆34Aug 27, 2025Updated 5 months ago
- Source code for UQnet☆16May 23, 2024Updated last year
- Code used for the AAAI 2020 paper "System Identification with Time-Aware Neural Sequence Models"☆16Nov 22, 2019Updated 6 years ago
- ViLReF: An Expert Knowledge Enabled Vision-Language Retinal Foundation Model☆22Oct 16, 2024Updated last year
- Using BERT for long sentence classification (more than 512 word pieces).☆17May 9, 2021Updated 4 years ago
- ☆20Apr 17, 2023Updated 2 years ago
- [ICLR2025 Spotlight] Advantage-Guided Distillation for Preference Alignment in Small Language Models☆24Feb 10, 2025Updated last year
- Implementation of MTAD-TF: Multivariate Time Series Anomaly Detection Using the Combination of Temporal Pattern and Feature Pattern☆16Feb 21, 2021Updated 4 years ago
- [ICLR 2026 Oral] Veritas: Generalizable Deepfake Detection via Pattern-Aware Reasoning.☆30Updated this week
- FastAPI Implementation of Orpheus TTS streaming Chatbot☆27Jun 19, 2025Updated 7 months ago
- ☆20Jan 19, 2022Updated 4 years ago
- ☆19Aug 9, 2024Updated last year
- Code for "Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations"☆15Jun 4, 2019Updated 6 years ago
- ☆21Jul 24, 2023Updated 2 years ago
- PyTorch implementation of the NCDSSM models presented in the ICML '23 paper "Neural Continuous-Discrete State Space Models for Irregularl…☆25Jul 9, 2023Updated 2 years ago
- Collection of resources that combine dynamical systems and control with deep learning.☆29May 18, 2021Updated 4 years ago
- Layer-wise Pruning of Transformer Heads for Efficient Language Modeling☆22Feb 22, 2022Updated 3 years ago
- Fork with streaming inference support + ~6× faster inference☆93Updated this week