📖 Full Stack Practice of the Large Language Model Training @ RLChina 2024
☆40Oct 15, 2024Updated last year
Alternatives and similar repositories for llm_training_full_stack
Users that are interested in llm_training_full_stack are comparing it to the libraries listed below.
- code for EMNLP 2024 paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis☆12Nov 17, 2024Updated last year
- Format your bibtex (.bib) file to help standardize citations for conference and journal submissions☆14Nov 23, 2025Updated 7 months ago
- ☆13Mar 29, 2026Updated 3 months ago
- [ICML 2025] Fast and Low-Cost Genomic Foundation Models via Outlier Removal.☆19Jun 19, 2025Updated last year
- This repository is the official implementation of Low-Rank Modular Reinforcement Learning via Muscle Synergy.☆12Oct 27, 2022Updated 3 years ago
- ☆12May 14, 2024Updated 2 years ago
- A reference panel guided topological structure annotation of Hi-C data☆10Mar 23, 2023Updated 3 years ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"☆40Nov 11, 2024Updated last year
- Minimal Decision Transformer Implementation written in Jax (Flax).☆18Aug 8, 2022Updated 3 years ago
- UHD is a surface electromyogram (sEMG) signals database, including original sEMG signals, the starting points and the termination points …☆12Dec 4, 2019Updated 6 years ago
- superquadrics based grasping☆13Dec 4, 2018Updated 7 years ago
- Python library for fitting massive mixture models using DP priors and GPU computation.☆23Apr 7, 2016Updated 10 years ago
- This is the repository for the Master of Science thesis titled "GAN-based Matrix Factorization for Recommender Systems".☆10Aug 10, 2020Updated 5 years ago
- Github Repo for CARL: Cautious Adaptation for RL in Safety Critical Settings☆14Nov 22, 2022Updated 3 years ago
- xDEVS: A cross-platform Discrete EVent System simulator☆16Nov 14, 2025Updated 7 months ago
- This repository contains an implementation of the Batch-BKB algorithm as described in the ICML 2020 paper "Near-linear time Gaussian proc…☆13Jul 14, 2020Updated 5 years ago
- An algorithm that intelligently executes a crypto order over time via Coinbase☆13Oct 26, 2021Updated 4 years ago
- Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs"☆24Apr 30, 2025Updated last year
- The frontend of ZVMS 4, powered by Element-plus, Vite, and Vue.☆12Feb 11, 2026Updated 4 months ago
- under review☆14Mar 1, 2021Updated 5 years ago
- 12th place solution for Kaggle Corporación Favorita Grocery Sales Forecasting☆15Jan 29, 2018Updated 8 years ago
- ☆15May 4, 2024Updated 2 years ago
- Implementation of ICLR 2025 paper "Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation"☆18Oct 5, 2024Updated last year
- ☆24Aug 8, 2022Updated 3 years ago
- Code and data for Cell-o1.☆28Sep 19, 2025Updated 9 months ago
- A python implementation of Dueling Bandit Gradient Descent (DBGD)☆24Jan 23, 2019Updated 7 years ago
- [ACL2023] Source code for Decouple knowledge from parameters for plug-and-play language modeling☆20Sep 18, 2023Updated 2 years ago
- Official code for "A General Learning Framework for Open Ad Hoc Teamwork Using Graph-based Policy Learning"☆15Mar 1, 2023Updated 3 years ago
- Official implementation of “Watch Your Step: A Fine-Grained Evaluation Framework for Multi-hop Knowledge Editing in Large Language Models…☆44Nov 25, 2025Updated 7 months ago
- 🧬 Large-scale protein functional residue or fragment prediction benchmark. (ICLR 2026)☆24Apr 10, 2026Updated 2 months ago
- (AAAI24 oral) Implementation of RPPO(Risk-sensitive PPO) and RPBT(Population-based self-play with RPPO)☆12May 22, 2023Updated 3 years ago
- A zero-shot faithfulness evaluation metric for text summarization☆11Oct 17, 2023Updated 2 years ago
- Preference Transformer: Modeling Human Preferences using Transformers for RL (ICLR2023 Accepted)☆168Oct 15, 2023Updated 2 years ago
- Topic Evolution Analysis - an algorithm for analyzing knowledge flow in text based corpora☆14Oct 16, 2016Updated 9 years ago
- Deep Recurrent Q-Network with different exploration strategies for self-driving cars (using AirSim)☆10Sep 5, 2024Updated last year
- A Gym-based environment to evaluate Multi-UAV Task Allocation algorithms☆12Feb 9, 2024Updated 2 years ago
- Multi-Agent Reinforcement Learning (MARL) method to learn scalable control policies for multi-agent target tracking (IROS22).☆11Jul 22, 2022Updated 3 years ago
- I use various Data Science and machine learning techniques to analyze customer data using STP framework. I preprocessed the data, perform…☆11Apr 26, 2020Updated 6 years ago
- Multimodal Model for Memotion Dataset☆12May 17, 2021Updated 5 years ago