A reproduction of the paper "Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction"
☆22May 29, 2024Updated last year
Alternatives and similar repositories for aligner-replication
Users who are interested in aligner-replication are comparing it to the libraries listed below.
- 33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU☆13May 5, 2024Updated last year
- Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning☆15Jun 28, 2025Updated 10 months ago
- Official code for ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning (AAAI'24)☆17Feb 10, 2024Updated 2 years ago
- ☆30Oct 8, 2025Updated 6 months ago
- [EMNLP2023]: MIRACLE: Towards Personalized Dialogue Generation with Latent-Space Multiple Personal Attribute Control☆12Nov 11, 2023Updated 2 years ago
- Accelerating RL for LLM Reasoning with Optimal Advantage Regression☆40May 30, 2025Updated 11 months ago
- Reimplementation of https://github.com/montemac/algebraic_value_editing in pure PyTorch for efficiency on large models☆11Jun 28, 2023Updated 2 years ago
- Dataset for EMNLP'23 Paper "DocTrack: A Visually-Rich Document Dataset Really Aligned with Human Eye Movement for Machine Reading"☆11Oct 25, 2023Updated 2 years ago
- ☆11Jan 23, 2019Updated 7 years ago
- The code and data for the paper JiuZhang3.0☆49May 26, 2024Updated last year
- Official implementation of "OffsetBias: Leveraging Debiased Data for Tuning Evaluators"☆26Sep 11, 2024Updated last year
- The official code for "OG-HFYOLO: Orientation Gradient Guidance and Heterogeneous Feature Fusion For Deformation Table Cell Instance Segm…☆13Jul 28, 2025Updated 9 months ago
- Direct preference optimization with f-divergences.☆16Nov 3, 2024Updated last year
- Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs☆52Jul 10, 2024Updated last year
- The official implementation of Self-Exploring Language Models (SELM)☆63Jun 4, 2024Updated last year
- 4 bits quantization of LLaMa using GPTQ☆12Jun 2, 2023Updated 2 years ago
- code for "Generative News Recommendation"☆15May 31, 2024Updated last year
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning"☆33Jul 25, 2025Updated 9 months ago
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment☆56Jun 16, 2024Updated last year
- Tidy autoregressive inference in JAX☆15Sep 1, 2025Updated 8 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆64Jul 8, 2024Updated last year
- Tools for content datamining and NLP at scale☆45Jun 20, 2024Updated last year
- ☆16Jun 18, 2022Updated 3 years ago
- ☆16Sep 30, 2023Updated 2 years ago
- ☆116Jan 21, 2025Updated last year
- ☆16Apr 28, 2023Updated 3 years ago
- The OlymMATH dataset☆24Jun 1, 2025Updated 11 months ago
- [NeurIPS 2025] UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents☆56Nov 27, 2025Updated 5 months ago
- Official repository for ORPO☆483May 31, 2024Updated last year
- Accepted by ACL 2025☆30Aug 13, 2025Updated 8 months ago
- Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr…☆116Feb 9, 2024Updated 2 years ago
- Decoupled Neural Interfaces Using Synthetic Gradients - under development☆11Jun 27, 2025Updated 10 months ago
- ☆19Jul 16, 2020Updated 5 years ago
- Stochastic trace estimation using JAX☆17Aug 20, 2025Updated 8 months ago
- A demo project of using ChatGPT to create Slate UI with TAPython in Unreal Engine 5. TAPython uses JSON for the user interface, which i…☆17Dec 30, 2023Updated 2 years ago
- ☆25Mar 4, 2024Updated 2 years ago
- Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment☆1,037May 31, 2024Updated last year
- CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era☆35Jun 18, 2025Updated 10 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]☆149Oct 27, 2024Updated last year