zwhong714 / PSFT
[ICLR 2026] PSFT is a trust-region–inspired fine-tuning objective that views SFT as a policy gradient method with constant advantages, constraining policy drift to stabilize training and improve generalization.
☆35 Sep 9, 2025 Updated 5 months ago
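The description above can be made concrete with a small sketch. It assumes the "trust-region-inspired" constraint takes the form of a PPO-style clipped probability ratio with the advantage fixed to 1; this page does not confirm that this is the repository's exact objective. The function name `clipped_sft_loss`, its parameters, and the default `eps=0.2` are hypothetical and chosen only for illustration.

```python
import math


def clipped_sft_loss(logp_new, logp_old, eps=0.2, advantage=1.0):
    """Illustrative PPO-style clipped surrogate with a constant advantage.

    logp_new: per-token log-probs of the target tokens under the current policy
    logp_old: per-token log-probs of the same tokens under the old policy
    eps:      clip range (hypothetical default, not taken from the repository)
    """
    if not logp_new or len(logp_new) != len(logp_old):
        raise ValueError("log-prob lists must be non-empty and of equal length")

    total = 0.0
    for lp_new, lp_old in zip(logp_new, logp_old):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Clip the ratio into [1 - eps, 1 + eps] to limit policy drift.
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        # Pessimistic (minimum) surrogate, as in PPO's clipped objective.
        total += min(ratio * advantage, clipped * advantage)

    # Negate the mean so that minimizing the loss maximizes the surrogate.
    return -total / len(logp_new)
```

With a positive constant advantage, the surrogate stops rewarding a token once its ratio exceeds `1 + eps`, which is one way to read "constraining policy drift": tokens the model already fits much better than the old policy no longer raise the objective.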
Alternatives and similar repositories for PSFT
Users who are interested in PSFT are comparing it to the repositories listed below.
- ☆27 Jul 18, 2025 Updated 6 months ago
- The official repository of the NeurIPS'25 paper "Ada-R1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization" ☆21 Nov 9, 2025 Updated 3 months ago
- NeurIPS'24 - LLM Safety Landscape ☆39 Oct 21, 2025 Updated 3 months ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 Apr 15, 2024 Updated last year
- This is the official code repository for the paper: Towards General Continuous Memory for Vision-Language Models. ☆19 Jul 3, 2025 Updated 7 months ago
- Generate quiz questions from PDF/text files ☆11 Feb 2, 2024 Updated 2 years ago
- Test-Time Label-Shift Adaptation ☆13 May 24, 2023 Updated 2 years ago
- [ICCV 2025 DeepID Challenge] Official 1st Place in both tracks (Detection & Localization) ☆17 Dec 24, 2025 Updated last month
- Compute training dynamics, plot data cartography, analysing data quality... ☆42 Nov 10, 2022 Updated 3 years ago
- ☆10 Sep 18, 2021 Updated 4 years ago
- Official implementation for Text Generation Beyond Discrete Token Sampling ☆21 Aug 11, 2025 Updated 6 months ago
- ☆20 Jul 23, 2025 Updated 6 months ago
- A script to convert ERNIE1.0 or ERNIE2.0 models to transformers' BERT format ☆10 Mar 28, 2020 Updated 5 years ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention ☆15 Jul 16, 2025 Updated 6 months ago
- ☆26 Oct 16, 2025 Updated 3 months ago
- ☆12 Apr 25, 2025 Updated 9 months ago
- Python bindings for NVIDIA CUDA APIs. ☆13 Mar 2, 2024 Updated last year
- LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussians ☆22 Jan 10, 2025 Updated last year
- Code for reproducing our paper: LMSOC: An Approach for Socially Sensitive Pretraining ☆13 Oct 22, 2021 Updated 4 years ago
- ☆17 May 3, 2025 Updated 9 months ago
- Code accompanying the 2022 DLS paper "Misleading Deep-Fake Detection with GAN Fingerprints" ☆10 May 26, 2022 Updated 3 years ago
- ☆12 Sep 15, 2025 Updated 4 months ago
- ☆10 Nov 1, 2019 Updated 6 years ago
- An app that autofills when2meet based on your Google Calendar ☆10 May 22, 2023 Updated 2 years ago
- Utility functions for weights and biases (wandb).