Repo for paper "Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability"
☆98Apr 23, 2026Updated last week
Alternatives and similar repositories for rethink_sft_generalization
Users that are interested in rethink_sft_generalization are comparing it to the libraries listed below.
- [NeurIPS 2025] Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection☆68Feb 2, 2026Updated 3 months ago
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence☆61Nov 11, 2025Updated 5 months ago
- ☆16Sep 17, 2024Updated last year
- PaperPub is an academic arena where diverse AI Agents read papers daily, pick apart each other's arguments, and fiercely debate.☆43Apr 17, 2026Updated 2 weeks ago
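- 📅 CCF DDL Tracker: A Lightweight Chrome Extension for Tracking CCF Deadlines (an open-source Chrome extension for tracking CCF submission deadlines) - 🚀 Add and manage your deadlines (DDLs) with one click, jump straight to the conference…☆23Apr 16, 2026Updated 2 weeks ago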
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆13Dec 12, 2024Updated last year
- Generate Persona 5 style “calling card” images.☆20Mar 5, 2025Updated last year
- Official Repository of "Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Ste…☆28Mar 9, 2026Updated last month
- [ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners.☆21Sep 24, 2025Updated 7 months ago
- Socratic-Zero is a fully autonomous framework that generates high-quality training data for mathematical reasoning☆36Oct 26, 2025Updated 6 months ago
- FlexiFilm: Long Video Generation with Flexible Conditions☆31May 1, 2024Updated 2 years ago
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25)☆15Jun 26, 2025Updated 10 months ago
- Adapter-X: A Novel General Parameter-Efficient Fine-Tuning Framework for Vision☆11Jul 22, 2024Updated last year
- Reasoning Activation in LLMs via Small Model Transfer (NeurIPS 2025)☆22Oct 16, 2025Updated 6 months ago
- Code for Research Project TLDR☆25Jul 28, 2025Updated 9 months ago
- Diagnostic Framework for LLMs and MLLMs☆36Mar 2, 2026Updated 2 months ago
- Fault Trees on R☆10Aug 26, 2023Updated 2 years ago
- This is the open-source code for TokenCarve.☆26Jan 23, 2026Updated 3 months ago
- Official repository for Activation-Informed Merging (AIM) of Large Language Models☆23Feb 10, 2025Updated last year
- [NeurIPS'25] ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions☆38Dec 7, 2025Updated 4 months ago
- Python SDK for CMDOP agent interaction☆42Apr 7, 2026Updated 3 weeks ago
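- DUT Compiler Principles course project: defines a subset of C, with lexical analysis, syntax analysis, semantic analysis, interpreted execution, and a corresponding graphical interface☆12Nov 13, 2020Updated 5 years ago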
- CVPR(Highlight) Decoupled Distillation to Erase: A General Unlearning Method for Any Class-centric Tasks☆21Jul 22, 2025Updated 9 months ago
- Official repository for paper "High-Dimension Human Value Representation in Large Language Models" (NAACL'25 Main)☆23Jul 9, 2024Updated last year
- OpenFTA☆14Jun 14, 2013Updated 12 years ago
- Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging. Arxiv, 2024.☆16Oct 28, 2024Updated last year
- A LangChain Deep Agent that helps me monitor my X feed.☆51Mar 13, 2026Updated last month
- Implementation of GradLoc from the Tencent Hunyuan blog "Stabilizing RLVR via Token-level Gradient Diagnosis and Layerwise Clipping".☆94Feb 16, 2026Updated 2 months ago
- ☆108Apr 24, 2026Updated last week
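- An open-source large math model project that aims to explore whether large models are capable of mathematical creativity, and what potential they have in frontier mathematical research.☆18Mar 19, 2026Updated last month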
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types☆25Nov 29, 2024Updated last year
- ☆14May 4, 2024Updated last year
- ☆47Updated this week
- AAAI2025☆12Apr 18, 2025Updated last year
- official code repo for paper "Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging"☆25Oct 11, 2025Updated 6 months ago
- Professor and Group List of CS☆10Mar 12, 2024Updated 2 years ago
- a PL/0 compiler☆16Aug 25, 2019Updated 6 years ago
- GEMS: Agent-Native Multimodal Generation with Memory and Skills☆123Apr 1, 2026Updated last month
- This is the official repository for the ICLR 2025 Conference Paper - Fast and Slow Streams for Online Time Series Forecasting without Inf…☆16Apr 30, 2025Updated last year