RFTT: Reasoning with Reinforced Functional Token Tuning
☆29 · Feb 12, 2026 · Updated 4 months ago
Alternatives and similar repositories for RFTT
Users interested in RFTT are comparing it to the repositories listed below.
- Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning ☆25 · Jun 25, 2025 · Updated 11 months ago
- Official repo for "ProSec: Fortifying Code LLMs with Proactive Security Alignment" ☆17 · Feb 26, 2026 · Updated 3 months ago
- ☆12 · Jul 4, 2024 · Updated last year
- Source code for reproducing the results reported in our paper: https://arxiv.org/abs/2409.17550 ☆14 · Nov 15, 2024 · Updated last year
- Getting started with backend development for the 求是潮 website ☆15 · Oct 10, 2014 · Updated 11 years ago
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni… ☆15 · Oct 31, 2025 · Updated 7 months ago
- TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics ☆21 · Nov 18, 2025 · Updated 6 months ago
- [ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability ☆14 · Mar 11, 2025 · Updated last year
- ☆14 · Jan 19, 2026 · Updated 4 months ago
- ☆11 · Oct 2, 2023 · Updated 2 years ago
- ☆37 · Nov 14, 2024 · Updated last year
- Code for Representation Bending Paper ☆17 · Jul 15, 2025 · Updated 10 months ago
- [TOG 2025] Order Matters: Learning Element Ordering for Graphic Design Generation ☆24 · Aug 5, 2025 · Updated 10 months ago
- ☆14 · Feb 12, 2024 · Updated 2 years ago
- [ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling" ☆25 · Mar 28, 2024 · Updated 2 years ago
- Hierarchical Attention Network based Explainable Knowledge Tracing ☆10 · May 18, 2022 · Updated 4 years ago
- ☆18 · Apr 24, 2024 · Updated 2 years ago
- The open source implementation of "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers" ☆19 · Mar 11, 2024 · Updated 2 years ago
- Ongoing research training transformer models at scale ☆18 · Jul 27, 2023 · Updated 2 years ago
- Code for paper "Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication" ☆23 · Mar 30, 2024 · Updated 2 years ago
- ☆13 · Mar 24, 2023 · Updated 3 years ago
- This is a pip package implementing Reinforcement Learning algorithms in non-stationary environments supported by the OpenAI Gym toolkit. ☆16 · Jun 28, 2024 · Updated last year
- Opus builds a tank game ☆24 · Jan 19, 2026 · Updated 4 months ago
- ☆10 · Nov 6, 2021 · Updated 4 years ago
- [NeurIPS 2024] The official repository of "Distribution-Aware Data Expansion with Diffusion Models". ☆17 · Dec 15, 2025 · Updated 5 months ago
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ☆32 · Jan 7, 2026 · Updated 5 months ago
- Complexity Based Prompting for Multi-Step Reasoning ☆17 · Mar 10, 2023 · Updated 3 years ago
- Repository containing common Makefiles for setting up conda environments. ☆10 · Feb 10, 2023 · Updated 3 years ago
- ☆27 · Sep 11, 2024 · Updated last year
- The official implementation for TKDE paper "Individual and Structural Graph Information Bottlenecks for Out-of-Distribution Generalizatio… ☆14 · Aug 6, 2023 · Updated 2 years ago
- This extension provides inference-time optimization techniques to enhance diffusion-based image generation quality through random search … ☆23 · Feb 27, 2025 · Updated last year
- ☆34 · Oct 13, 2025 · Updated 8 months ago
- PyTorch implementation for the paper "Let Me Help You! Neuro-Symbolic Short-Context Action Anticipation" accepted to RA-L'24. ☆12 · Nov 27, 2024 · Updated last year
- First instruction-tuning dataset distilled from Claude2 (52k Alpaca prompts)! ☆13 · Oct 22, 2023 · Updated 2 years ago
- ☆10 · Oct 11, 2022 · Updated 3 years ago
- [AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003) ☆23 · Oct 2, 2025 · Updated 8 months ago
- PyTorch code for the paper "An Empirical Study of Multimodal Model Merging" ☆37 · Oct 11, 2023 · Updated 2 years ago
- Framework for Cost-Effective Language Model Choice ☆16 · Dec 12, 2023 · Updated 2 years ago
- [ICML 2025] Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search ☆114 · Jun 3, 2025 · Updated last year