☆64May 8, 2026Updated this week
Alternatives and similar repositories for llm_trainer
Users that are interested in llm_trainer are comparing it to the libraries listed below.
- Implements an LLM in PyTorch, with support for MoE and RoPE☆67Apr 25, 2026Updated 2 weeks ago
- Building a large model from scratch: a complete walkthrough from pre-training to RLHF☆2,639Mar 19, 2026Updated last month
- Official implementation for the paper "Quantum Bayesian Optimization" accepted to NeurIPS 2023.☆12Jan 7, 2024Updated 2 years ago
- Official implementation for the paper "Sample-Then-Optimize Batch Neural Thompson Sampling", published at NeurIPS 2022.☆12Oct 13, 2022Updated 3 years ago
- Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling"☆16Nov 11, 2024Updated last year
- PPO in one file☆27Oct 26, 2024Updated last year
- Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library☆52Aug 20, 2025Updated 8 months ago
- A distributed in-memory store for temporal knowledge graphs☆10Mar 20, 2024Updated 2 years ago
- Natural Language to Overpass Query Language☆30Mar 14, 2024Updated 2 years ago
- A Python implementation of PROCLUS: PROjected CLUStering algorithm.☆10Jan 12, 2015Updated 11 years ago
- A Sample Code Project for ASP.NET 5 with Dapr☆13Apr 18, 2021Updated 5 years ago
- ☆27Dec 11, 2025Updated 4 months ago
- MeloTTS demo on Axera☆12Nov 18, 2025Updated 5 months ago
- Reproduced the DFT method without using Verl. https://arxiv.org/abs/2508.05629☆23Oct 14, 2025Updated 6 months ago
- Java implementation of the BERT tokenizer, with support for outputting ONNX tensors for ONNX model inference☆13Sep 4, 2023Updated 2 years ago
- Implementation of the network model from "Multiway Attention Networks for Modeling Sentence Pairs"; can be used for question answering and sentence-level logical inference☆11Apr 13, 2020Updated 6 years ago
- ☆15Apr 23, 2026Updated 2 weeks ago
- ☆25Mar 8, 2026Updated 2 months ago
- LLM KV Cache compression - K+V dual compression, 73-99% VRAM savings, zero accuracy loss☆51Mar 30, 2026Updated last month
- The implementation of Text Classification with Negative Supervision (ACL, 2020)☆10Oct 8, 2020Updated 5 years ago
- ☆10Jan 12, 2024Updated 2 years ago
- ☆13Sep 25, 2021Updated 4 years ago
- MetaSearch: an implementation of an LLM deep research (deepsearch) feature☆33Aug 21, 2025Updated 8 months ago
- Tokenizing Chinese corpora with SentencePiece☆13Nov 30, 2023Updated 2 years ago
- Run AI models locally and offline on your machine☆11May 8, 2025Updated last year
- 👂 Typing is slow, talk to me. The project name means 'I am tired' in Chinese (我累了). This is an AI efficiency assistant, complete your d…☆16Jun 8, 2024Updated last year
- Experimental syslog template mining module☆11Aug 29, 2016Updated 9 years ago
- ☆17Jan 31, 2025Updated last year
- Methods and experiments for assumed density SDE approximations☆12Jan 26, 2022Updated 4 years ago
- Taylor moment expansion in Python (JaX and SymPy) and Matlab☆11Nov 26, 2024Updated last year
- The baseline system for the ICASSP2024 ICMC-ASR Challenge.☆56Dec 6, 2023Updated 2 years ago
- A distributed locker based on ZooKeeper and implemented in Golang.☆15Aug 27, 2016Updated 9 years ago
- STRODE: Stochastic Boundary Ordinary Differential Equation☆13Jul 20, 2021Updated 4 years ago
- ☆13Oct 24, 2021Updated 4 years ago
- Models and code for the ICLR 2020 workshop paper "Towards Understanding Normalization in Neural ODEs"☆16Apr 27, 2020Updated 6 years ago
- ☆20Apr 17, 2023Updated 3 years ago
- This is the official implementation for the paper: Use Your INSTINCT: INSTruction optimization usIng Neural bandits Coupled with Transfor…☆53Jun 9, 2024Updated last year
- The official repository for AdaMuon☆38Aug 27, 2025Updated 8 months ago
- A lightweight Transformer training & inference framework☆47Updated this week