hbin0701/Self-Explore

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/hbin0701/Self-Explore)

hbin0701 / Self-Explore

[𝐄𝐌𝐍𝐋𝐏 𝐅𝐢𝐧𝐝𝐢𝐧𝐠𝐬 𝟐𝟎𝟐𝟒 & 𝐀𝐂𝐋 𝟐𝟎𝟐𝟒 𝐍𝐋𝐑𝐒𝐄 𝐎𝐫𝐚𝐥] 𝘌𝘯𝘩𝘢𝘯𝘤𝘪𝘯𝘨 𝘔𝘢𝘵𝘩𝘦𝘮𝘢𝘵𝘪𝘤𝘢𝘭 𝘙𝘦𝘢𝘴𝘰𝘯𝘪𝘯𝘨 𝘪𝘯 𝘓𝘢𝘯𝘨𝘶𝘢𝘨𝘦 𝘔𝘰𝘥𝘦𝘭𝘴 𝘸𝘪𝘵𝘩 𝘍𝘪𝘯𝘦-𝘨𝘳𝘢𝘪𝘯𝘦𝘥 𝘙𝘦𝘸𝘢𝘳𝘥𝘴

☆52

Alternatives and similar repositories for Self-Explore

Users that are interested in Self-Explore are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

kaistAI / Janus
View on GitHub
[NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages
☆53Aug 10, 2025Updated 11 months ago
ChengpengLi1003 / DotaMath
View on GitHub
☆30Dec 27, 2024Updated last year
kaistAI / InstructIR
View on GitHub
IntructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc…
☆32Jun 13, 2024Updated 2 years ago
hanningzhang / prm
View on GitHub
☆17Nov 3, 2024Updated last year
halfrot / ALaRM
View on GitHub
[ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling"
☆25Mar 28, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
kaistAI / LangBridge
View on GitHub
[ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision
☆97Oct 30, 2024Updated last year
kimyuji / EvolvingQA_benchmark
View on GitHub
Code and Dataset release of "Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models" (NAACL 2024)
☆10Oct 16, 2024Updated last year
MattYoon / reasoning-models-confidence
View on GitHub
[NeurIPS 2025] Reasoning Models Better Express Their Confidence"
☆23Nov 19, 2025Updated 8 months ago
joonkeekim / Instructive-Decoding
View on GitHub
Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp…
☆21Mar 7, 2024Updated 2 years ago
FreedomIntelligence / OVM
View on GitHub
☆74Apr 2, 2024Updated 2 years ago
naver-ai / ALMoST
View on GitHub
☆24Dec 2, 2023Updated 2 years ago
RUCAIBox / FIGA
View on GitHub
[ICLR 2024] This is the official implementation for the paper: "Beyond imitation: Leveraging fine-grained quality signals for alignment"
☆10May 5, 2024Updated 2 years ago
joeljang / FLM
View on GitHub
All-in-one repository for Fine-tuning & Pretraining (Large) Language Models
☆15Mar 8, 2023Updated 3 years ago
Ablustrund / APPS_Plus
View on GitHub
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
☆73Aug 31, 2024Updated last year
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
r-three / realistic_evaluation_of_model_merging_for_compositional_generalization
View on GitHub
☆12Feb 11, 2026Updated 5 months ago
morning9393 / ETPO
View on GitHub
☆14Mar 5, 2024Updated 2 years ago
thu-coai / SPaR
View on GitHub
☆47Jun 11, 2025Updated last year
etri-edgeai / nn-dist-train-poc
View on GitHub
☆17Oct 12, 2024Updated last year
MARIO-Math-Reasoning / MARIO
View on GitHub
☆28May 8, 2024Updated 2 years ago
KodCode-AI / code-r1
View on GitHub
Reproducing R1 for Code with Reliable Rewards
☆13Apr 9, 2025Updated last year
SparkJiao / dpo-trajectory-reasoning
View on GitHub
[EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".
☆84Jan 14, 2025Updated last year
icip-cas / SSO
View on GitHub
A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza…
☆20Nov 21, 2024Updated last year
kaistAI / Volcano
View on GitHub
[NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…
☆49Aug 21, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
tangzhy / RealCritic
View on GitHub
☆15Jan 27, 2025Updated last year
mathllm / Step-Controlled_DPO
View on GitHub
☆23Jul 5, 2024Updated 2 years ago
etri-edgeai / nn-comp-llm
View on GitHub
☆17Oct 10, 2024Updated last year
sail-sg / AnytimeReasoner
View on GitHub
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
☆54Jul 15, 2025Updated last year
allenai / easy-to-hard-generalization
View on GitHub
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48Jan 17, 2024Updated 2 years ago
janphilippfranken / sami
View on GitHub
Self-Supervised Alignment with Mutual Information
☆20May 24, 2024Updated 2 years ago
TIGER-AI-Lab / MAmmoTH2
View on GitHub
Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]
☆146Oct 27, 2024Updated last year
CriticBench / CriticBench
View on GitHub
[ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
☆31Mar 5, 2024Updated 2 years ago
seonghyeonye / Flipped-Learning
View on GitHub
[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
☆117Jun 28, 2025Updated last year
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
lilakk / BLEUBERI
View on GitHub
Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"
☆32Jun 5, 2025Updated last year
joonkeekim / hare-hate-speech
View on GitHub
Official repository of "HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning", Findings of EMNLP 2023
☆28Jan 25, 2024Updated 2 years ago
THUDM / Self-Contrast
View on GitHub
Extensive Self-Contrast Enables Feedback-Free Language Model Alignment
☆20Apr 2, 2024Updated 2 years ago
MARIO-Math-Reasoning / Super_MARIO
View on GitHub
☆341Jun 5, 2025Updated last year
prometheus-eval / prometheus-vision
View on GitHub
[ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and inexpensive to use. Specific…
☆86Sep 13, 2024Updated last year
RUCAIBox / RLMEC
View on GitHub
The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"
☆39Jan 12, 2024Updated 2 years ago
raymin0223 / fast_robust_early_exit
View on GitHub
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)
☆67Sep 28, 2024Updated last year