[NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
☆36Apr 7, 2025Updated last year
Alternatives and similar repositories for TempBalance
Users that are interested in TempBalance are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆19Nov 10, 2024Updated last year
- [NeurIPS 2024] AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models☆34Jun 9, 2025Updated last year
- [NeurIPS 2021] code for "Taxonomizing local versus global structure in neural network loss landscapes" https://arxiv.org/abs/2107.11228☆20Jan 7, 2022Updated 4 years ago
- Open source code for ICML 2025 Paper: Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias☆43Nov 14, 2025Updated 6 months ago
- Dataset and code for the paper MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations (ACL'24).☆26May 2, 2025Updated last year
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- This is the official code for the paper "Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable".☆34Mar 11, 2025Updated last year
- ☆17Feb 3, 2022Updated 4 years ago
- Welcome to the 'In Context Learning Theory' Reading Group☆31Nov 8, 2024Updated last year
- The "CoT-ICL Lab" framework for meta-training transformers☆11Jun 3, 2026Updated last week
- Debiasing Through Data Attribution☆13May 23, 2024Updated 2 years ago
- This code implements ProtoViT, a novel approach that combines Vision Transformers with prototype-based learning to create interpretable i…☆42May 19, 2025Updated last year
- Lion and Adam optimization comparison☆65Feb 23, 2023Updated 3 years ago
- Experiments with reasoning models, training techniques, papers☆30Updated this week
- Gated Pretrained Transformer model for robust denoised sequence-to-sequence modelling☆10May 29, 2021Updated 5 years ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Code for Characterizing Scaling and Transfer Learning Behavior of FNO in SciML☆54May 31, 2023Updated 3 years ago
- H3M-SSMoEs: Hypergraph-based Multimodal Learning with LLM Reasoning and Style-Structured Mixture of Experts☆29Feb 20, 2026Updated 3 months ago
- [ICLR 2025] On Evluating the Durability of Safegurads for Open-Weight LLMs☆13Jun 20, 2025Updated 11 months ago
- Two-party Privacy-preserving Neural Network Training using Split Learning and Homomorphic Encryption (CKKS Scheme)☆12Sep 23, 2025Updated 8 months ago
- Offcial Repo of Paper "Eliminating Position Bias of Language Models: A Mechanistic Approach""☆23Jun 13, 2025Updated 11 months ago
- [NeurIPS'24] "NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes" by Hao-Lun …☆10Sep 18, 2025Updated 8 months ago
- The official implementation of the paper "Large Scale Knowledge Washing"☆10Jun 12, 2024Updated 2 years ago
- Use contrastive learning to train a large language model (LLM) as a retriever☆12Jul 19, 2024Updated last year
- ☆32Nov 4, 2024Updated last year
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Implementation for the protocols described in https://eprint.iacr.org/2023/1700☆14Apr 29, 2026Updated last month
- Code release for MPCViT accepted by ICCV 2023☆16Jan 6, 2025Updated last year
- ☆13Jan 30, 2021Updated 5 years ago
- [ICLR 2025] This is the official implementation for the paper: "Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluat…☆45Jun 11, 2025Updated last year
- Code for Generalization Guarantees for (Multi-Modal) Imitation Learning☆11Jul 14, 2022Updated 3 years ago
- The repository of the paper "REEF: Representation Encoding Fingerprints for Large Language Models," aims to protect the IP of open-source…☆78Jan 16, 2025Updated last year
- fast trainer for educational purposes☆26Jun 4, 2026Updated last week
- ☆116Jan 21, 2025Updated last year
- ☆14May 23, 2023Updated 3 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…☆16Apr 21, 2025Updated last year
- ☆10Oct 11, 2022Updated 3 years ago
- Towards Better Graph Representation Learning with Parameterized Decomposition & Filtering☆13Aug 22, 2023Updated 2 years ago
- Concise Reasoning via Reinforcement Learning☆13Apr 16, 2025Updated last year
- The implementation of the IEEE S&P 2024 paper MM-BD: Post-Training Detection of Backdoor Attacks with Arbitrary Backdoor Pattern Types Us…☆16May 12, 2024Updated 2 years ago
- Provide implementations and pre-trained models of MobileNet-v1, v2, and v3☆16Dec 11, 2020Updated 5 years ago
- ☆18Aug 15, 2022Updated 3 years ago