[NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
☆36Apr 7, 2025Updated last year
Alternatives and similar repositories for TempBalance
Users who are interested in TempBalance are comparing it to the libraries listed below.
- ☆18Nov 10, 2024Updated last year
- [NeurIPS 2024] AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models☆33Jun 9, 2025Updated 10 months ago
- [NeurIPS 2021] code for "Taxonomizing local versus global structure in neural network loss landscapes" https://arxiv.org/abs/2107.11228☆20Jan 7, 2022Updated 4 years ago
- Dataset and code for the paper MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations (ACL'24).☆26May 2, 2025Updated 11 months ago
- [CVPR23] "Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations" by Lei Hsi…☆24Sep 17, 2025Updated 6 months ago
- This is the official code for the paper "Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable".☆31Mar 11, 2025Updated last year
- ☆16Feb 3, 2022Updated 4 years ago
- AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code that solves each…☆95Mar 12, 2026Updated 3 weeks ago
- Welcome to the 'In Context Learning Theory' Reading Group☆30Nov 8, 2024Updated last year
- [ACL 2025] Official implementation of the "CoT-ICL Lab" framework☆11Oct 10, 2025Updated 6 months ago
- Debiasing Through Data Attribution☆13May 23, 2024Updated last year
- Lion and Adam optimization comparison☆64Feb 23, 2023Updated 3 years ago
- Official implementation of Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More☆25Feb 25, 2025Updated last year
- Experiments with reasoning models, training techniques, papers☆28Updated this week
- The official repository for AdaMuon☆37Aug 27, 2025Updated 7 months ago
- Code for Characterizing Scaling and Transfer Learning Behavior of FNO in SciML☆53May 31, 2023Updated 2 years ago
- Gated Pretrained Transformer model for robust denoised sequence-to-sequence modelling☆10May 29, 2021Updated 4 years ago
- Code associated with ICML (2024). "Defense against Backdoor Attack on Pre-trained Language Models via Head Pruning and Attention Normaliz…☆10Feb 22, 2026Updated last month
- ☆44Oct 1, 2024Updated last year
- TorFS is a plugin that enables RocksDB to access FDP SSDs☆14Jul 16, 2025Updated 8 months ago
- Code for the paper: Fast and Private Inference of Deep Neural Networks by Co-designing Activation Functions☆11Mar 13, 2024Updated 2 years ago
- H3M-SSMoEs: Hypergraph-based Multimodal Learning with LLM Reasoning and Style-Structured Mixture of Experts☆26Feb 20, 2026Updated last month
- [ICLR 2025] On Evaluating the Durability of Safeguards for Open-Weight LLMs☆13Jun 20, 2025Updated 9 months ago
- Two-party Privacy-preserving Neural Network Training using Split Learning and Homomorphic Encryption (CKKS Scheme)☆11Sep 23, 2025Updated 6 months ago
- ☆14Aug 2, 2023Updated 2 years ago
- Official Repo of Paper "Eliminating Position Bias of Language Models: A Mechanistic Approach"☆21Jun 13, 2025Updated 9 months ago
- Python 3 support for the MS COCO caption evaluation tools☆14Jun 14, 2024Updated last year
- The official implementation of the paper "Large Scale Knowledge Washing"☆10Jun 12, 2024Updated last year
- Use contrastive learning to train a large language model (LLM) as a retriever☆12Jul 19, 2024Updated last year
- Implementation for the protocols described in https://eprint.iacr.org/2023/1700☆14Jan 9, 2025Updated last year
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- [ICLR 2025] This is the official implementation for the paper: "Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluat…☆44Jun 11, 2025Updated 10 months ago
- Code of the paper: Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function☆13Nov 22, 2022Updated 3 years ago
- The repository of the paper "REEF: Representation Encoding Fingerprints for Large Language Models," aims to protect the IP of open-source…☆77Jan 16, 2025Updated last year
- fast trainer for educational purposes☆25Updated this week
- ☆116Jan 21, 2025Updated last year
- ☆14May 23, 2023Updated 2 years ago
- A basic implementation of a SAT attack on logic locking.☆13Jun 30, 2021Updated 4 years ago
- [MSST '24] Prophet: Optimizing LSM-Based Key-Value Store on ZNS SSDs with File Lifetime Prediction and Compaction Compensation.☆14Apr 20, 2024Updated last year