[ICLR 2025] This repository contains the code to reproduce the results from our paper "From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency".
☆12Mar 7, 2025Updated last year
Alternatives and similar repositories for Learning-Parity-with-CoT
Users that are interested in Learning-Parity-with-CoT are comparing it to the libraries listed below.
- ☆12Nov 13, 2024Updated last year
- ☆20Nov 28, 2024Updated last year
- Part of my Udacity Data Science Nanodegree☆10Apr 3, 2020Updated 5 years ago
- ☆13Mar 22, 2023Updated 3 years ago
- ☆29Nov 16, 2025Updated 4 months ago
- ☆17May 31, 2023Updated 2 years ago
- This is the PyTorch 1.0 implementation of SENet for training on the NWPU-RESISC45 dataset☆15Apr 12, 2019Updated 6 years ago
- This script automates the process of unlocking Apple ID accounts by solving captcha challenges, verifying account details, and resetting …☆14Jan 24, 2026Updated 2 months ago
- Official release of code for the paper "RL Is a Hammer and LLMs Are Nails: A Simple RL Approach to Stronger Prompt Injection Attacks"☆42Feb 11, 2026Updated last month
- Direct preference optimization with f-divergences.☆16Nov 3, 2024Updated last year
- ☆34Jul 5, 2023Updated 2 years ago
- Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"☆10Dec 13, 2024Updated last year
- Paper list of compositional zero-shot learning☆11Jul 5, 2022Updated 3 years ago
- UFT: Unifying Supervised and Reinforcement Fine-Tuning☆27Jun 30, 2025Updated 8 months ago
- Documentation at☆14Mar 27, 2025Updated last year
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents☆27Mar 9, 2026Updated 3 weeks ago
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective☆63Mar 11, 2025Updated last year
- Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling☆34Jan 24, 2026Updated 2 months ago
- ☆12Nov 18, 2022Updated 3 years ago
- ☆10Jul 6, 2021Updated 4 years ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Apr 17, 2024Updated last year
- Directional Preference Alignment☆59Sep 23, 2024Updated last year
- Official Implementation of "Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts" at EMNLP 202…☆13Oct 27, 2024Updated last year
- Source code for "Taming GANs with Lookahead–Minmax", ICLR 2021.☆15Mar 28, 2021Updated 5 years ago
- [ICLR 2025 SSI-FM] Self-Taught Self-Correction for Small Language Models☆11Sep 19, 2025Updated 6 months ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated last year
- Project of ACL 2025 "UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models"☆14Mar 25, 2025Updated last year
- JAX implementation of "Fine-Tuning Language Models with Just Forward Passes"☆19Jun 10, 2023Updated 2 years ago
- ☆10Dec 11, 2025Updated 3 months ago
- Implementations of the algorithms described in the paper: On the Convergence Theory for Hessian-Free Bilevel Algorithms.☆11Nov 1, 2024Updated last year
- Optimization algorithm which fits a ResNet to CIFAR-10 5x faster than SGD / Adam (with terrible generalization)☆14Oct 20, 2023Updated 2 years ago
- Maximum mean discrepancy comparisons for single cell profiling experiments☆20Feb 9, 2022Updated 4 years ago
- This is the official implementation of the ICML 2023 paper "Can Forward Gradient Match Backpropagation?"☆13May 31, 2023Updated 2 years ago
- Experiments with Super-Universal Newton method.☆13Aug 12, 2022Updated 3 years ago
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives".☆29Oct 30, 2024Updated last year
- Source code for EMNLP2022 paper "Finding Skill Neurons in Pre-trained Transformers via Prompt Tuning".☆18Mar 13, 2023Updated 3 years ago
- ☆23Oct 17, 2022Updated 3 years ago
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization☆97Aug 20, 2024Updated last year