Implements reinforcement learning training for LLMs such as GPT-2, LLaMA, BLOOM, and so on
☆27Sep 19, 2023Updated 2 years ago
Alternatives and similar repositories for llm_rlhf
Users that are interested in llm_rlhf are comparing it to the libraries listed below
- Repo - Paper "Capturing Semantics for Imputation with Pre-trained Language Models." [ICDE 2021]☆10Mar 13, 2022Updated 3 years ago
- Code and data for the VLDB 2023 paper: RECA: Related Tables Enhanced Column Semantic Type Annotation Framework☆12May 7, 2025Updated 9 months ago
- ☆11May 11, 2022Updated 3 years ago
- chatglm_rlhf_finetuning☆30Oct 10, 2023Updated 2 years ago
- ☆16May 31, 2024Updated last year
- Large language model fine-tuning for bloom, opt, gpt, gpt2, llama, llama-2, cpmant, and so on☆99Apr 24, 2024Updated last year
- The source code of the Sudowoodo paper in ICDE 2023☆18May 24, 2023Updated 2 years ago
- ☆15Jul 24, 2018Updated 7 years ago
- Reinforcement learning (RL) is an effective method to find reasoning pathways in incomplete knowledge graphs (KGs). To overcome the chall…☆23Oct 13, 2024Updated last year
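- (1) Rotary position embedding encoder with elastic interval normalization, plus PEFT LoRA quantized training, to improve performance on contexts of tens of thousands of tokens. (2) Evidence-theory explanation learning to improve the model's complex logical reasoning ability. (3) Compatible with the Alpaca data format.☆45Jul 19, 2023Updated 2 years ago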
- This repo contains code for paper: "Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach".☆24Oct 21, 2024Updated last year
- A large language model fine-tuning project, including QLoRA fine-tuning of ChatGLM and LLaMA☆28Jun 26, 2023Updated 2 years ago
- ☆12Sep 25, 2023Updated 2 years ago
- Math24o: High School Olympiad Mathematics Chinese Benchmark☆11Mar 27, 2025Updated 11 months ago
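- Converts OCR and other models from Paddle to ONNX format, then loads these ONNX models with DJL, a Java deep learning framework, to try inference and prediction.☆13Nov 15, 2022Updated 3 years ago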
- This repo is for residual-connected sentence encoder for NLI.☆11Jan 21, 2018Updated 8 years ago
- A Challenge on Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG), Co-located with SLT2024 FutureDial-RAG Challenge☆11Aug 10, 2024Updated last year
- ChatRPC is a framework that allows large language models to interact with external services.☆10Dec 25, 2023Updated 2 years ago
- AI application service platform☆28Nov 12, 2025Updated 3 months ago
- High-performance C++ memory pool☆11May 10, 2021Updated 4 years ago
- A third-party implementation of paper《SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spell…☆14Nov 27, 2020Updated 5 years ago
- ☆11Jun 4, 2022Updated 3 years ago
- ☆28Jan 5, 2026Updated 2 months ago
- ☆19May 28, 2025Updated 9 months ago
- The repo for paper: Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models.☆13Dec 16, 2024Updated last year
- Templates and examples for ACL and EMNLP conference posters.☆14Oct 5, 2024Updated last year
- ☆16Apr 28, 2023Updated 2 years ago
- A work-in-progress framework for building multi-platform, multi-user VR experiences for social learning spaces.☆20Apr 3, 2025Updated 11 months ago
- The official implementation of the paper S2-VER: Semi-Supervised Visual Emotion Recognition☆11Apr 28, 2024Updated last year
- Long Context Research☆29Jan 26, 2026Updated last month
- HealthFC: Verifying Health Claims with Evidence-Based Medical Fact-Checking☆12Apr 11, 2025Updated 10 months ago
- A collection of K8s deployment YAML files for common open-source software (Jaeger, grafana, consul, prometheus, nginx-ingress-controller) and common resources (deployment, svc, ingress...)☆12Jun 27, 2020Updated 5 years ago
- API documentation for BlueBrain projects☆12Dec 1, 2021Updated 4 years ago
- ☆12Sep 23, 2024Updated last year
- ☆12Mar 21, 2019Updated 6 years ago
- ☆11Nov 14, 2020Updated 5 years ago
- Converts Quora's new NLU dataset to SNLI txt/jsonl format, plus test/dev split, tokenization.☆14Jan 27, 2017Updated 9 years ago
- Transfer Learning on Dogs vs Cats dataset using PyTorch C++ API☆12Aug 16, 2019Updated 6 years ago
- **ASCM4ABSA** - Our code and proposed data for NLPCC 2022 paper titled "Aspect-specific Context Modeling for Aspect-based Sentiment Analy…☆12Mar 26, 2023Updated 2 years ago