Implements pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF) to train and fine-tune the LLaMA2 model to follow human instructions, similar to InstructGPT or ChatGPT but on a much smaller scale.
☆57 · Mar 9, 2024 · Updated 2 years ago
Alternatives and similar repositories for InstructLLaMA
Users that are interested in InstructLLaMA are comparing it to the libraries listed below.
- ☆11 · Aug 10, 2024 · Updated last year
- The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning” ☆17 · Feb 26, 2024 · Updated 2 years ago
- ☆19 · Oct 7, 2020 · Updated 5 years ago
- ☆14 · Dec 9, 2021 · Updated 4 years ago
- The predecessor of CiteLab. ☆18 · Feb 3, 2026 · Updated 4 months ago
- R3: Robust Rubric-Agnostic Reward Models ☆22 · Jul 12, 2025 · Updated 11 months ago
- [ACL 2024 Findings] Learning Fine-Grained Grounded Citations for Attributed Large Language Models ☆20 · Oct 24, 2024 · Updated last year
- Language model evaluation for morality and causality ☆20 · Nov 14, 2023 · Updated 2 years ago
- Toward Practical Entity Alignment Method Design: Insights from New Highly Heterogeneous Knowledge Graph Datasets ☆17 · Feb 18, 2025 · Updated last year
- Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF" ☆34 · Dec 14, 2023 · Updated 2 years ago
- Long-tail Augmented Graph Contrastive Learning for Recommendation, ECML/PKDD 2023 ☆12 · Sep 22, 2023 · Updated 2 years ago
- [ICML 2024] "Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection" ☆15 · Feb 15, 2025 · Updated last year
- ☆13 · Oct 9, 2024 · Updated last year
- Implementation of the CPTR model by https://arxiv.org/pdf/2101.10804.pdf ☆10 · Mar 27, 2022 · Updated 4 years ago
- PyTorch KoBART/DistilKoBART Application ☆14 · Oct 10, 2022 · Updated 3 years ago
- ☆23 · Jan 17, 2025 · Updated last year
- Uncertainty-Aware Curriculum Learning for Neural Machine Translation (ACL 2020) ☆11 · Jun 12, 2020 · Updated 6 years ago
- ☆12 · May 14, 2024 · Updated 2 years ago
- The code of "Deep Regression Representation Learning with Topology" in ICML 2024 ☆14 · Jul 4, 2024 · Updated last year
- Source code for Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach (NeurIPS 2023) ☆10 · Dec 12, 2023 · Updated 2 years ago
- Implementation of Reinforcement Learning from Human Feedback (RLHF) ☆173 · Apr 7, 2023 · Updated 3 years ago
- Code for RECENT ☆13 · Dec 18, 2022 · Updated 3 years ago
- Yet another dynamic batch sampler for variable sequence data in PyTorch. ☆13 · Dec 9, 2021 · Updated 4 years ago
- ☆13 · Sep 20, 2020 · Updated 5 years ago
- Source code for ICDE 2020 paper Collective Entity Alignment via Adaptive Features (CEA). ☆16 · Jun 10, 2020 · Updated 6 years ago
- The official implementation of EMNLP 2021 paper "#HowYouTagTweets: Learning User Hashtagging Preferences via Personalized Topic Attention… ☆11 · Feb 21, 2023 · Updated 3 years ago
- ☆47 · Apr 8, 2022 · Updated 4 years ago
- An algorithm that intelligently executes a crypto order over time via Coinbase ☆13 · Oct 26, 2021 · Updated 4 years ago
- Wind data service interface for the VeighNa framework ☆20 · Jun 11, 2025 · Updated last year
- D3PE (Deep Data-Driven Policy Evaluation) aims to evaluate a large set of candidate policies from a fixed dataset to select the best ones. ☆10 · Jun 2, 2022 · Updated 4 years ago
- Code for Predictive Engagement: An Efficient Metric for Automatic Evaluation of Open-Domain Dialogue Systems ☆16 · Jun 8, 2021 · Updated 5 years ago
- ☆19 · Oct 24, 2023 · Updated 2 years ago
- Dataset2024 ☆12 · Jun 12, 2025 · Updated last year
- Implementation of ICLR 2025 paper "Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation" ☆18 · Oct 5, 2024 · Updated last year
- [ICME 2025 Oral] Official PyTorch code for "Fraesormer: Learning Adaptive Sparse Transformer for Efficient Food Recognition" ☆11 · Mar 21, 2025 · Updated last year
- Subliminal learning in LLMs: language models can transmit hidden preferences through seemingly unrelated training data. ☆24 · Nov 9, 2025 · Updated 7 months ago
- Code for [NeurIPS'2019 Spotlight] Policy Continuation with Hindsight Inverse Dynamics ☆15 · Jan 7, 2020 · Updated 6 years ago
- [ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO ☆65 · Apr 30, 2025 · Updated last year
- PyTorch code for paper QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models ☆25 · Sep 27, 2023 · Updated 2 years ago