Reinforcement learning training for LLMs such as GPT-2, LLaMA, BLOOM, and so on
☆27Sep 19, 2023Updated 2 years ago
Alternatives and similar repositories for llm_rlhf
Users that are interested in llm_rlhf are comparing it to the libraries listed below.
- chatglm_rlhf_finetuning☆30Oct 10, 2023Updated 2 years ago
- Large language model fine-tuning for BLOOM, OPT, GPT, GPT-2, LLaMA, LLaMA-2, CPM-Ant, and so on☆101Apr 24, 2024Updated last year
- ☆11May 11, 2022Updated 3 years ago
- Repo - Paper "Capturing Semantics for Imputation with Pre-trained Language Models." [ICDE 2021]☆10Mar 13, 2022Updated 4 years ago
- Share data, prompt data, pretraining data☆36Nov 30, 2023Updated 2 years ago
- Code and data for the VLDB 2023 paper: RECA: Related Tables Enhanced Column Semantic Type Annotation Framework☆12May 7, 2025Updated 11 months ago
- The source code of the Sudowoodo paper in ICDE 2023☆18May 24, 2023Updated 2 years ago
- Code for the paper "Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond…☆24May 31, 2022Updated 3 years ago
- Fun LLM Agent Projects I Designed & Built☆58Jan 3, 2026Updated 3 months ago
- Resources for PVLDB 2023 submission☆27Aug 28, 2024Updated last year
- ☆15Jul 24, 2018Updated 7 years ago
- AdaLoGN: Adaptive Logic Graph Network for Reasoning-Based Machine Reading Comprehension (ACL 2022)☆27May 20, 2022Updated 3 years ago
- ☆13Apr 10, 2025Updated last year
- aigc_serving: a lightweight and efficient language model inference service☆24Jun 12, 2024Updated last year
- LangChain Agent☆11Nov 25, 2025Updated 4 months ago
- [TPAMI] "Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search"…☆17Jan 4, 2023Updated 3 years ago
- ☆21May 28, 2025Updated 10 months ago
- Natural Language Processing Toolkit for Neuroscience☆27Dec 4, 2024Updated last year
- deep learning☆150May 6, 2025Updated 11 months ago
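- Math24o: High School Olympiad Mathematics Chinese Benchmark☆11Mar 27, 2025Updated last year
- Modify ChatGLM output with only RLHF: apply RLHF directly to ChatGLM to raise or lower the probability of target outputs☆198May 23, 2023Updated 2 years ago
- (1) Rotary position embedding encoder with elastic interval normalization plus PEFT LoRA quantized training, improving support for contexts of tens of thousands of tokens. (2) Evidence-theory explanation learning, improving the model's complex logical reasoning ability. (3) Compatible with the Alpaca data format.☆45Jul 19, 2023Updated 2 years ago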
- Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"☆58Jun 27, 2022Updated 3 years ago
- ☆18Jan 31, 2023Updated 3 years ago
- Analysis code for knowledge discovery project☆12Sep 25, 2018Updated 7 years ago
- LLaMa Tuning with Stanford Alpaca Dataset using Deepspeed and Transformers☆49Mar 15, 2023Updated 3 years ago
- [ICLR 2024] Adaptive Replay Ratio implementation from 'Revisiting Plasticity in Visual RL: Data, Modules and Training Stages'.☆13Oct 9, 2024Updated last year
- A simple PyTorch implementation of CLIP model using DinoV2 and BERT☆15Sep 26, 2023Updated 2 years ago
- This repo is for residual-connected sentence encoder for NLI.☆11Jan 21, 2018Updated 8 years ago
- Chinese few-shot NER☆16Aug 28, 2022Updated 3 years ago
- aigc evals☆10Dec 2, 2023Updated 2 years ago
- Dynamic Mixture of Progressive Parameter-Efficient Expert Library for Lifelong Robot Learning☆27Jul 4, 2025Updated 9 months ago
- ☆15Sep 20, 2018Updated 7 years ago
- A literature review for constructing and using knowledge graphs in a biomedical setting.☆11May 22, 2020Updated 5 years ago
- DeepDive Biomedical Tools☆15Apr 3, 2017Updated 9 years ago
- Official repository for ALT (ALignment with Textual feedback).☆10Jul 25, 2024Updated last year
- ☆13Dec 4, 2017Updated 8 years ago
- Code for the paper "Abstractive Summarization Guided by Latent Hierarchical Document Structure"☆13May 20, 2023Updated 2 years ago
- PyTorch Implementation of the Sequential Multiagent Rollout algorithm☆11Jun 28, 2024Updated last year