qibin0506 / llm_trainer
☆50Jan 29, 2026Updated 2 weeks ago
Alternatives and similar repositories for llm_trainer
Users who are interested in llm_trainer compare it to the libraries listed below.
- LLM implementation in PyTorch, with support for MoE and RoPE☆38Jan 29, 2026Updated 2 weeks ago
- Building a large language model from scratch: a complete walkthrough from pretraining to RLHF☆2,392Updated this week
- Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library☆50Aug 20, 2025Updated 5 months ago
- ☆22Dec 11, 2025Updated 2 months ago
- A python implementation of PROCLUS: PROjected CLUStering algorithm.☆10Jan 12, 2015Updated 11 years ago
- Java implementation of the BERT tokenizer; supports ONNX tensor output for ONNX model inference☆12Sep 4, 2023Updated 2 years ago
- Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling"☆16Nov 11, 2024Updated last year
- The implementation of Text Classification with Negative Supervision (ACL, 2020)☆10Oct 8, 2020Updated 5 years ago
- Multiagent optimization system (MAOS) for solving the Traveling Salesman Problem (TSP).☆12Aug 7, 2019Updated 6 years ago
- Repository for score-based transport modeling.☆11Jul 22, 2023Updated 2 years ago
- ☆23Jun 26, 2025Updated 7 months ago
- ☆10Jan 12, 2024Updated 2 years ago
- Experimental syslog template mining module☆11Aug 29, 2016Updated 9 years ago
- Implementation of the network model from "Multiway Attention Networks for Modeling Sentence Pairs"; usable for question answering and sentence-level logical inference☆11Apr 13, 2020Updated 5 years ago
- MeloTTS demo on Axera☆10Nov 18, 2025Updated 2 months ago
- [ICLR 2022] Denoising Likelihood Score Matching for Conditional Score-based Data Generation☆11Jan 2, 2025Updated last year
- Methods and experiments for assumed density SDE approximations☆12Jan 26, 2022Updated 4 years ago
- ☆15Jun 22, 2025Updated 7 months ago
- STRODE: Stochastic Boundary Ordinary Differential Equation☆13Jul 20, 2021Updated 4 years ago
- Taylor moment expansion in Python (JaX and SymPy) and Matlab☆11Nov 26, 2024Updated last year
- Code for "Translatotron-V(ison): An End-to-End Model for In-Image Machine Translation" (Findings of ACL 2024)☆16Jul 4, 2024Updated last year
- Overlapping Reads COmpression with Minimizers☆16May 19, 2022Updated 3 years ago
- ☆13Oct 24, 2021Updated 4 years ago
- 👂 Typing is slow, talk to me. The project name means 'I am tired' in Chinese (我累了). This is an AI efficiency assistant, complete your d…☆16Jun 8, 2024Updated last year
- Advanced implementation of DeepSeek-R1 featuring Group Relative Policy Optimization (GRPO) for mathematical reasoning AI. Integrates safe…☆13Jan 29, 2025Updated last year
- ☆16May 12, 2023Updated 2 years ago
- ☆16Jan 31, 2025Updated last year
- The official repository for AdaMuon☆34Aug 27, 2025Updated 5 months ago
- Code used for the AAAI 2020 paper "System Identification with Time-Aware Neural Sequence Models"☆16Nov 22, 2019Updated 6 years ago
- Source code for UQnet☆16May 23, 2024Updated last year
- FastAPI Implementation of Orpheus TTS streaming Chatbot☆27Jun 19, 2025Updated 7 months ago
- [ICLR2025 Spotlight] Advantage-Guided Distillation for Preference Alignment in Small Language Models☆24Feb 10, 2025Updated last year
- ☆20Apr 17, 2023Updated 2 years ago
- [ICLR 2026 Oral] Veritas: Generalizable Deepfake Detection via Pattern-Aware Reasoning.☆30Updated this week
- ☆20Aug 11, 2021Updated 4 years ago
- ☆19Aug 9, 2024Updated last year
- ☆20Jan 19, 2022Updated 4 years ago
- Few-Shot Text Classification with Induction Network☆18Apr 23, 2020Updated 5 years ago
- ☆21Jul 24, 2023Updated 2 years ago