📖 Full Stack Practice of the Large Language Model Training @ RLChina 2024
☆40Oct 15, 2024Updated last year
Alternatives and similar repositories for llm_training_full_stack
Users that are interested in llm_training_full_stack are comparing it to the libraries listed below.
- ☆31May 15, 2026Updated 3 weeks ago
- All in one PDF Parser Toolkit☆17Sep 15, 2023Updated 2 years ago
- Work in progress LLM framework.☆16Oct 31, 2024Updated last year
- ☆14Mar 5, 2024Updated 2 years ago
- implementation of the paper "APRIL: Towards Scalable and Transferable Autonomous Penetration Testing in Large Action Space via Action Emb…☆12Dec 24, 2024Updated last year
- code for EMNLP 2024 paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis☆12Nov 17, 2024Updated last year
- A Reinforcement Learning agent that learns to play Super Mario Bros.☆23May 31, 2020Updated 6 years ago
- Format your bibtex (.bib) file to help standardize citations for conference and journal submissions☆14Nov 23, 2025Updated 6 months ago
- Deep RL agents for NASimEmu. See also https://github.com/jaromiru/NASimEmu.☆15Jul 16, 2024Updated last year
- The code of "Deep Regression Representation Learning with Topology" in ICML 2024☆14Jul 4, 2024Updated last year
- A project to automatically generate program repair recommendation in the field of smart contracts for given code snippets with their cont…☆16Aug 30, 2025Updated 9 months ago
- ☆12Feb 23, 2023Updated 3 years ago
- ☆16Dec 29, 2022Updated 3 years ago
- Minimal Decision Transformer Implementation written in Jax (Flax).☆18Aug 8, 2022Updated 3 years ago
- This repository contains an implementation of an anomaly detection method called DPLAN, which is based on the reinforcement learning fram…☆13Jan 8, 2024Updated 2 years ago
- ☆20Nov 7, 2024Updated last year
- A blockchain that organically orchestrates Proof of Work (PoW) and AI Training computations together☆14Nov 30, 2022Updated 3 years ago
- superquadrics based grasping☆13Dec 4, 2018Updated 7 years ago
- xDEVS: A cross-platform Discrete EVent System simulator☆16Nov 14, 2025Updated 6 months ago
- ☆11Oct 12, 2023Updated 2 years ago
- Example of an autoencoder set up for spectrograms, using Theano and Lasagne☆12Jan 13, 2016Updated 10 years ago
- Contact: Alexander Hartl, Maximilian Bachl, Fares Meghdouri. Explainability methods and Adversarial Robustness metrics for RNNs for Intru…☆19Mar 16, 2021Updated 5 years ago
- An algorithm that intelligently executes a crypto order over time via Coinbase☆13Oct 26, 2021Updated 4 years ago
- ☆19Dec 17, 2023Updated 2 years ago
- Official PyTorch code for "Sample Efficient Offline-to-Online Reinforcement Learning" in TKDE'23.☆16Aug 14, 2023Updated 2 years ago
- under review☆14Mar 1, 2021Updated 5 years ago
- 12th place solution for Kaggle Corporación Favorita Grocery Sales Forecasting☆15Jan 29, 2018Updated 8 years ago
- ☆15May 4, 2024Updated 2 years ago
- Implementation of ICLR 2025 paper "Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation"☆18Oct 5, 2024Updated last year
- ☆24Aug 8, 2022Updated 3 years ago
- A Python module for mapping multiple high-dimensional datasets into a common low-dimensional space.☆10Mar 29, 2018Updated 8 years ago
- A python implementation of Dueling Bandit Gradient Descent (DBGD)☆24Jan 23, 2019Updated 7 years ago
- [ACL2023] Source code for Decouple knowledge from parameters for plug-and-play language modeling☆20Sep 18, 2023Updated 2 years ago
- ☆12Jun 29, 2024Updated last year
- Official code for "A General Learning Framework for Open Ad Hoc Teamwork Using Graph-based Policy Learning"☆17Mar 1, 2023Updated 3 years ago
- Implementation of "Deep reinforcement learning for imbalanced classification" and its extended version to multi-class☆16Sep 30, 2021Updated 4 years ago
- Official implementation of “Watch Your Step: A Fine-Grained Evaluation Framework for Multi-hop Knowledge Editing in Large Language Models…☆45Nov 25, 2025Updated 6 months ago
- (AAAI24 oral) Implementation of RPPO (Risk-sensitive PPO) and RPBT (Population-based self-play with RPPO)☆12May 22, 2023Updated 3 years ago
- Topic Evolution Analysis - an algorithm for analyzing knowledge flow in text based corpora☆14Oct 16, 2016Updated 9 years ago