davendw49/llm_training_full_stack

📖 Full Stack Practice of the Large Language Model Training @ RLChina 2024

☆40

Alternatives and similar repositories for llm_training_full_stack

Users who are interested in llm_training_full_stack are comparing it to the repositories listed below.

Acemap / pdf_parser
All in one PDF Parser Toolkit
☆17 · Sep 15, 2023 · Updated 2 years ago
morning9393 / ETPO
☆14 · Mar 5, 2024 · Updated 2 years ago
zepingyu0512 / arithmetic-mechanism
code for EMNLP 2024 paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
☆12 · Nov 17, 2024 · Updated last year
0xWelt / BibTeX-Formatter
Format your bibtex (.bib) file to help standardize citations for conference and journal submissions
☆14 · Nov 23, 2025 · Updated 8 months ago
jaromiru / NASimEmu-agents
Deep RL agents for NASimEmu. See also https://github.com/jaromiru/NASimEmu.
☆15 · Jul 16, 2024 · Updated 2 years ago
plm-team / PLM
PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing
☆21 · Mar 18, 2025 · Updated last year
HangtingYe / UADB
☆13 · Mar 29, 2026 · Updated 3 months ago
kyegomez / LM-Infinite
Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆40 · Nov 11, 2024 · Updated last year
frt03 / jax_dt
Minimal Decision Transformer Implementation written in Jax (Flax).
☆18 · Aug 8, 2022 · Updated 3 years ago
iscar-ucm / xdevs
xDEVS: A cross-platform Discrete EVent System simulator
☆16 · Nov 14, 2025 · Updated 8 months ago
CN-TU / adversarial-recurrent-ids
Contact: Alexander Hartl, Maximilian Bachl, Fares Meghdouri. Explainability methods and Adversarial Robustness metrics for RNNs for Intru…
☆19 · Mar 16, 2021 · Updated 5 years ago
GSYfate / knnlm-limits
Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs"
☆24 · Apr 30, 2025 · Updated last year
SeerLabs / sbdsubjectclassifier
Scholarly Big Data Subject Category Classifier
☆10 · Jul 15, 2019 · Updated 7 years ago
guosyjlu / OEMA
Official PyTorch code for "Sample Efficient Offline-to-Online Reinforcement Learning" in TKDE'23.
☆16 · Aug 14, 2023 · Updated 2 years ago
mansicer / Q-Adapter
Implementation of ICLR 2025 paper "Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation"
☆18 · Oct 5, 2024 · Updated last year
Xuekai-Zhu / storytrans_public
☆11 · Jul 24, 2023 · Updated 3 years ago
yanxue7 / E3T-Overcooked
☆15 · May 4, 2024 · Updated 2 years ago
terenceylchow124 / Meme-MultiModal
Multimodal Model for Memotion Dataset
☆12 · May 17, 2021 · Updated 5 years ago
Shaokang-Agent / Awesome-Reinforcement-Learning-Papers
📚 List of Top-tier Conference Papers on Reinforcement Learning (RL), including: NeurIPS, AAAI, IJCAI, ICML, AAMAS, ICLR, ICRA, etc. | （AI…
☆11 · Aug 20, 2023 · Updated 2 years ago
uoe-agents / PO-GPL
Official code for "A General Learning Framework for Open Ad Hoc Teamwork Using Graph-based Policy Learning"
☆15 · Mar 1, 2023 · Updated 3 years ago
Hannibal046 / PlugLM
[ACL2023] Source code for Decouple knowledge from parameters for plug-and-play language modeling
☆20 · Sep 18, 2023 · Updated 2 years ago
GeorgeLuImmortal / RDL-Rationales-centric-Double-robustness-Learning
☆12 · Jun 29, 2024 · Updated 2 years ago
jintrone / TEvA
Topic Evolution Analysis - an algorithm for analyzing knowledge flow in text based corpora
☆14 · Oct 16, 2016 · Updated 9 years ago
Jackory / RPBT
(AAAI24 oral) Implementation of RPPO (Risk-sensitive PPO) and RPBT (Population-based self-play with RPPO)
☆12 · May 22, 2023 · Updated 3 years ago
go-aie / llmflow
Orchestration engine & UI for your customized LLM flow.
☆21 · Apr 6, 2024 · Updated 2 years ago
ValentinaZangirolami / DRL
Deep Recurrent Q-Network with different exploration strategies for self-driving cars (using AirSim)
☆10 · Sep 5, 2024 · Updated last year
csmile-1006 / PreferenceTransformer
Preference Transformer: Modeling Human Preferences using Transformers for RL (ICLR2023 Accepted)
☆168 · Oct 15, 2023 · Updated 2 years ago
NikhilSehgal123 / coinbase-execution-algorithm
An algorithm that intelligently executes a crypto order over time via Coinbase
☆13 · Oct 26, 2021 · Updated 4 years ago
vishnukanduri / Customer-Analytics-in-Python
I use various Data Science and machine learning techniques to analyze customer data using STP framework. I preprocessed the data, perform…
☆11 · Apr 26, 2020 · Updated 6 years ago
grasp-lyrl / scalableMARL
Multi-Agent Reinforcement Learning (MARL) method to learn scalable control policies for multi-agent target tracking (IROS22).
☆11 · Jul 22, 2022 · Updated 4 years ago
nihil21 / semg-bss
Decomposition of sEMG signals via Blind Source Separation
☆14 · May 8, 2025 · Updated last year
JiaQiSJTU / FaithEval-FFLM
A zero-shot faithfulness evaluation metric for text summarization
☆11 · Oct 17, 2023 · Updated 2 years ago
Dahoas / QDSyntheticData
☆14 · Aug 15, 2024 · Updated last year
YasPHP / The-Five-Factor-Personality-Test
Unsupervised learning coupled with applied factor analysis to the five-factor model (FFM), a taxonomy for personality traits used to desc…
☆16 · Jun 19, 2021 · Updated 5 years ago
Kaivalya192 / Object_Reconstruction
generate depth cam footage to obj file 3d model (textured)
☆21 · Dec 17, 2025 · Updated 7 months ago
siddheshih / culture-awareness-llms
☆20 · Nov 7, 2024 · Updated last year
thuml / SPOT
Code release for "Supported Policy Optimization for Offline Reinforcement Learning" (NeurIPS 2022), https://arxiv.org/abs/2202.06239
☆22 · Jun 24, 2023 · Updated 3 years ago
Berbardo / MarioRL
A Reinforcement Learning agent that learns to play Super Mario Bros.
☆23 · May 31, 2020 · Updated 6 years ago
Ice-Hazymoon / slidev-addon-slidepods
☆27 · Oct 20, 2024 · Updated last year