Official code for our paper "Reasoning Models Hallucinate More: Factuality-Aware Reinforcement Learning for Large Reasoning Models"
☆25 · Oct 31, 2025 · Updated 7 months ago
Alternatives and similar repositories for FSPO
Users that are interested in FSPO are comparing it to the libraries listed below.
- Few-Shot Preference Optimization (FSPO) personalizes LLMs by reframing reward modeling as a meta-learning problem, enabling rapid adaptat… ☆16 · Feb 27, 2025 · Updated last year
- LLM-Check: Investigating Detection of Hallucinations in Large Language Models (NeurIPS 2024) ☆40 · Dec 8, 2024 · Updated last year
- ☆12 · Sep 23, 2024 · Updated last year
- codes for "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models" ☆12 · Feb 10, 2025 · Updated last year
- Train large COMET (T5-3B/GPT2-XL) with small memory (on 11GB memory GPUs like 1080/2080) using DeepSpeed. ☆14 · Jan 23, 2022 · Updated 4 years ago
- ☆50 · Jan 7, 2024 · Updated 2 years ago
- ☆15 · Aug 3, 2021 · Updated 4 years ago
- ☆12 · Dec 18, 2024 · Updated last year
- GAN paper list in text generation (2017-2020) Say it Often... ☆12 · Jul 10, 2020 · Updated 5 years ago
- A Beamer Theme of UCAS for academic report, thesis and talk. ☆19 · Oct 12, 2024 · Updated last year
- ☆11 · Oct 25, 2024 · Updated last year
- [CVPR 2023] "TrojViT: Trojan Insertion in Vision Transformers" by Mengxin Zheng, Qian Lou, Lei Jiang ☆15 · Jan 5, 2024 · Updated 2 years ago
- Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion" ☆14 · Mar 28, 2024 · Updated 2 years ago
- Official Implementation for "Purifying Quantization-conditioned Backdoors via Layer-wise Activation Correction with Distribution Approxim… ☆12 · Aug 14, 2024 · Updated last year
- [ECCV'24 Oral] The official GitHub page for ''Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking … ☆38 · Oct 23, 2024 · Updated last year
- Efficient Scaling laws and collaborative pretraining. ☆22 · Sep 18, 2025 · Updated 8 months ago
- IAN: An Intelligent System for Omics Data Analysis and Discovery ☆10 · Feb 23, 2026 · Updated 3 months ago
- ☆15 · Jun 25, 2025 · Updated 11 months ago
- Code for paper "Towards Efficient Pareto Set Approximation via Weight-Ensembling Mixture of Experts" ☆11 · Sep 13, 2024 · Updated last year
- KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality ☆46 · May 19, 2026 · Updated 3 weeks ago
- [AAAI 2026] This is the official implementation of the paper "ExtendAttack: Attacking Servers of LRMs via Extending Reasoning". ☆23 · Mar 18, 2026 · Updated 2 months ago
- The code implementation of MuScleLoRA (Accepted in ACL 2024) ☆10 · Dec 1, 2024 · Updated last year
- ☆16 · Aug 14, 2022 · Updated 3 years ago
- [NeurIPS 2025] Bag of Tricks for Inference-time Computation of LLM Reasoning ☆16 · Sep 20, 2025 · Updated 8 months ago
- ☆12 · Sep 14, 2023 · Updated 2 years ago
- The official implementation of the paper "Free Fine-tuning: A Plug-and-Play Watermarking Scheme for Deep Neural Networks". ☆19 · Apr 19, 2024 · Updated 2 years ago
- Procedural data generators suite for synthetic pretraining and formal reasoning ☆41 · Updated this week
- [Re-implementation] FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence ☆15 · Jun 29, 2020 · Updated 5 years ago
- A framework for evaluating the effectiveness of chain-of-thought reasoning in language models. ☆19 · Feb 6, 2025 · Updated last year
- Geometric Problem Solving Integrating FormalGeo Symbolic System and Hypergraph Neural Network. ☆16 · Sep 23, 2025 · Updated 8 months ago
- The program ranked first in Audio-only track of DCASE2024 Challenge task3. ☆22 · Mar 2, 2026 · Updated 3 months ago
- Implementation of KDR-Agent, the AAAI 2025 accepted paper, focusing on knowledge-driven reasoning for autonomous agents. ☆21 · Nov 24, 2025 · Updated 6 months ago
- ☆24 · Oct 30, 2025 · Updated 7 months ago
- Analysis on the MS-MARCO leaderboard regarding the machine reading comprehension task. ☆21 · Dec 14, 2020 · Updated 5 years ago
- ☆26 · Jun 10, 2025 · Updated last year
- This repo contains visualization code of our ReplicaPano Dataset. ☆18 · Feb 7, 2025 · Updated last year
- ☆22 · Feb 4, 2026 · Updated 4 months ago
- Project for SNARE benchmark ☆11 · Jun 5, 2024 · Updated 2 years ago
- This is the public repository for SALSA-Lite features for polyphonic sound event localization and detection using microphone arrays. ☆15 · Dec 3, 2021 · Updated 4 years ago