📖 Full Stack Practice of the Large Language Model Training @ RLChina 2024
☆40Oct 15, 2024Updated last year
Alternatives and similar repositories for llm_training_full_stack
Users who are interested in llm_training_full_stack are comparing it to the libraries listed below.
- All in one PDF Parser Toolkit☆17Sep 15, 2023Updated 2 years ago
- PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing☆20Mar 18, 2025Updated last year
- ☆14Mar 5, 2024Updated 2 years ago
- code for EMNLP 2024 paper: Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis☆12Nov 17, 2024Updated last year
- [AAAI 2021]Knowledge-Driven Distractor Generation for Cloze-Style Multiple Choice Questions☆22Jul 29, 2021Updated 4 years ago
- Format your bibtex (.bib) file to help standardize citations for conference and journal submissions☆14Nov 23, 2025Updated 4 months ago
- Deep RL agents for NASimEmu. See also https://github.com/jaromiru/NASimEmu.☆15Jul 16, 2024Updated last year
- ☆13Nov 4, 2025Updated 4 months ago
- ☆12May 14, 2024Updated last year
- A project to automatically generate program repair recommendation in the field of smart contracts for given code snippets with their cont…☆16Aug 30, 2025Updated 6 months ago
- Unofficial faiss wheel builder for NVIDIA GPU☆34Mar 8, 2026Updated 2 weeks ago
- Minimal Decision Transformer Implementation written in Jax (Flax).☆17Aug 8, 2022Updated 3 years ago
- ☆15Sep 25, 2021Updated 4 years ago
- This repository contains an implementation of an anomaly detection method called DPLAN, which is based on the reinforcement learning fram…☆12Jan 8, 2024Updated 2 years ago
- [ICLR 2025] Large (Vision) Language Models are Unsupervised In-Context Learners☆22Jun 6, 2025Updated 9 months ago
- This is the repository for the Master of Science thesis titled "GAN-based Matrix Factorization for Recommender Systems".☆10Aug 10, 2020Updated 5 years ago
- Github Repo for CARL: Cautious Adaptation for RL in Safety Critical Settings☆14Nov 22, 2022Updated 3 years ago
- Contact: Alexander Hartl, Maximilian Bachl, Fares Meghdouri. Explainability methods and Adversarial Robustness metrics for RNNs for Intru…☆19Mar 16, 2021Updated 5 years ago
- Wind (万得) data service interface for the VeighNa framework☆18Jun 11, 2025Updated 9 months ago
- An algorithm that intelligently executes a crypto order over time via Coinbase☆13Oct 26, 2021Updated 4 years ago
- Code for COLING 2020 paper "Controllable Abstractive Sentence Summarization with Guiding Entities"☆12Dec 24, 2020Updated 5 years ago
- Official PyTorch code for "Sample Efficient Offline-to-Online Reinforcement Learning" in TKDE'23.☆16Aug 14, 2023Updated 2 years ago
- under review☆14Mar 1, 2021Updated 5 years ago
- ☆14May 4, 2024Updated last year
- Implementation of ICLR 2025 paper "Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation"☆18Oct 5, 2024Updated last year
- ☆24Aug 8, 2022Updated 3 years ago
- [ACL2023] Source code for Decouple knowledge from parameters for plug-and-play language modeling☆20Sep 18, 2023Updated 2 years ago
- ☆12Jun 29, 2024Updated last year
- Implementation of "Deep reinforcement learning for imbalanced classification" and its extended version to multi-class☆17Sep 30, 2021Updated 4 years ago
- (AAAI24 oral) Implementation of RPPO(Risk-sensitive PPO) and RPBT(Population-based self-play with RPPO)☆12May 22, 2023Updated 2 years ago
- A zero-shot faithfulness evaluation metric for text summarization☆11Oct 17, 2023Updated 2 years ago
- Preference Transformer: Modeling Human Preferences using Transformers for RL (ICLR2023 Accepted)☆167Oct 15, 2023Updated 2 years ago
- Topic Evolution Analysis - an algorithm for analyzing knowledge flow in text based corpora☆14Oct 16, 2016Updated 9 years ago
- A Gym-based environment to evaluate multi-UAV task allocation algorithms☆13Feb 9, 2024Updated 2 years ago
- ☆14Aug 15, 2024Updated last year
- AttentionDTA: prediction of drug–target binding affinity using attention model. https://ieeexplore.ieee.org/abstract/document/8983125☆13Aug 29, 2020Updated 5 years ago
- Multi-Agent Reinforcement Learning (MARL) method to learn scalable control polices for multi-agent target tracking (IROS22).☆11Jul 22, 2022Updated 3 years ago
- Unsupervised learning coupled with applied factor analysis to the five-factor model (FFM), a taxonomy for personality traits used to desc…☆16Jun 19, 2021Updated 4 years ago
- Code release for "Supported Policy Optimization for Offline Reinforcement Learning" (NeurIPS 2022), https://arxiv.org/abs/2202.06239☆22Jun 24, 2023Updated 2 years ago