[ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency.
☆12Mar 7, 2025Updated last year
Alternatives and similar repositories for Learning-Parity-with-CoT
Users that are interested in Learning-Parity-with-CoT are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆12Nov 13, 2024Updated last year
- ☆20Nov 28, 2024Updated last year
- Part of my Udacity Data Science Nanodegree☆10Apr 3, 2020Updated 6 years ago
- ☆13Mar 22, 2023Updated 3 years ago
- ☆31Nov 16, 2025Updated 5 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- ☆17May 31, 2023Updated 2 years ago
- This script automates the process of unlocking Apple ID accounts by solving captcha challenges, verifying account details, and resetting …☆14Jan 24, 2026Updated 2 months ago
- This is the PyTorch1.0 implement of SENet to train on NWPU-RESISC45 dataset☆15Apr 12, 2019Updated 7 years ago
- Direct preference optimization with f-divergences.☆16Nov 3, 2024Updated last year
- ☆34Jul 5, 2023Updated 2 years ago
- Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"☆10Dec 13, 2024Updated last year
- Paper list of compositional zero-shot learning☆11Jul 5, 2022Updated 3 years ago
- Official release of code for the paper RL is a hammer and LLMs are nails A simple RL approach to stronger prompt injection attacks☆43Updated this week
- 免费梯子,免费VPN,真正免费的的VPN,shadowsocks,v2rey,官网地址www.dragonvpn.cc☆10Sep 4, 2024Updated last year
- Deploy open-source AI quickly and easily - Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- Documentation at☆14Mar 27, 2025Updated last year
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents☆31Updated this week
- Code for the paper: Why Transformers Need Adam: A Hessian Perspective☆65Mar 11, 2025Updated last year
- ☆12Nov 18, 2022Updated 3 years ago
- ☆10Jul 6, 2021Updated 4 years ago
- UFT: Unifying Supervised and Reinforcement Fine-Tuning☆28Jun 30, 2025Updated 9 months ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Apr 17, 2024Updated 2 years ago
- Directional Preference Alignment☆61Sep 23, 2024Updated last year
- Source code for "Taming GANs with Lookahead–Minmax", ICLR 2021.☆15Mar 28, 2021Updated 5 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- Official Implementation of "Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts" at EMNLP 202…☆13Oct 27, 2024Updated last year
- [ICLR 2025 SSI-FM] Self-Taught Self-Correction for Small Language Models☆11Sep 19, 2025Updated 7 months ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated 2 years ago
- Project of ACL 2025 "UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models"☆14Mar 25, 2025Updated last year
- JAX implementation of "Fine-Tuning Language Models with Just Forward Passes"☆19Jun 10, 2023Updated 2 years ago
- Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling☆36Jan 24, 2026Updated 2 months ago
- ☆10Dec 11, 2025Updated 4 months ago
- Implementations of the algorithms described in the paper: On the Convergence Theory for Hessian-Free Bilevel Algorithms.☆11Nov 1, 2024Updated last year
- Optimization algorithm which fits a ResNet to CIFAR-10 5x faster than SGD / Adam (with terrible generalization)☆14Oct 20, 2023Updated 2 years ago
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- This is the official implementation of the ICML 2023 paper - Can Forward Gradient Match Backpropagation ?☆13May 31, 2023Updated 2 years ago
- Maximum mean discrepancy comparisons for single cell profiling experiments☆20Feb 9, 2022Updated 4 years ago
- Experiments with Super-Universal Newton method.☆13Aug 12, 2022Updated 3 years ago
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives".☆29Oct 30, 2024Updated last year
- Source code for EMNLP2022 paper "Finding Skill Neurons in Pre-trained Transformers via Prompt Tuning".☆18Mar 13, 2023Updated 3 years ago
- ☆23Oct 17, 2022Updated 3 years ago
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year