Understanding the interplay between memorization and generalization in neural networks, featuring MAT, a learning algorithm to enhance robustness by mitigating spurious correlations.
☆40 · Dec 19, 2024 · Updated last year
Alternatives and similar repositories for Pitfalls-of-Memorization
Users who are interested in Pitfalls-of-Memorization are comparing it to the repositories listed below.
- Official implementation of "Multi-scale Feature Learning Dynamics: Insights for Double Descent". ☆17 · Jun 10, 2022 · Updated 3 years ago
- Discovering environments with XRM ☆16 · Dec 6, 2024 · Updated last year
- Code for "Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model", EMNLP Findings 20… ☆28 · Nov 2, 2023 · Updated 2 years ago
- The source code for "MG-BERT: Multi-Graph Augmented BERT for Masked Language Modeling" paper (NAACL 2021, TextGraphs-15). ☆12 · Jun 11, 2021 · Updated 4 years ago
- ☆14 · Jan 11, 2024 · Updated 2 years ago
- ☆12 · Jan 11, 2018 · Updated 8 years ago
- Concise Reasoning via Reinforcement Learning ☆13 · Apr 16, 2025 · Updated 11 months ago
- Implementation of the paper - Fast Training of Convolutional Networks through FFTs (CUDA for parallelization) ☆10 · May 8, 2020 · Updated 5 years ago
- Structure-Aware Feature Stylization for Domain Generalization ☆12 · Oct 7, 2023 · Updated 2 years ago
- 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. ☆13 · Mar 16, 2023 · Updated 3 years ago
- A Persian font family for changing the page number in Microsoft Word to Persian or Arabic digits ☆12 · Apr 24, 2019 · Updated 6 years ago
- TACL 2025: Investigating Adversarial Trigger Transfer in Large Language Models ☆19 · Aug 17, 2025 · Updated 7 months ago
- This repository contains the dataset and code for our ACL'23 publication: "MatSci-NLP: Evaluating Scientific Language Models on Materials… ☆16 · Nov 21, 2023 · Updated 2 years ago
- ☆11 · Jul 25, 2021 · Updated 4 years ago
- Simple 3-layer feed-forward neural network using back-propagation to recognize MNIST digits ☆10 · Jul 10, 2017 · Updated 8 years ago
- CAMMARL: Conformal Action Modeling in Multi Agent Reinforcement Learning ☆15 · Jun 24, 2024 · Updated last year
- This is the repository of the Dense Hierarchical Retrieval for Open-Domain Question Answering ☆14 · Dec 23, 2021 · Updated 4 years ago
- These are the materials I have prepared as a TA to teach to students who have enrolled in the Deep Learning course. ☆14 · Nov 23, 2019 · Updated 6 years ago
- GRPO Training Script for Qwen Model on GSM8K Dataset. This script trains a Qwen model using the GRPO (Generalized Reinforcement Policy Op… ☆28 · Dec 11, 2025 · Updated 3 months ago
- Tools to connect to and interact with the Mila cluster ☆79 · Feb 26, 2026 · Updated 3 weeks ago
- Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs) ☆62 · Jul 24, 2025 · Updated 7 months ago
- Gradient Starvation: A Learning Proclivity in Neural Networks ☆61 · Jan 10, 2021 · Updated 5 years ago
- python project template for personal projects! 🙋♀️ ☆11 · Nov 28, 2020 · Updated 5 years ago
- ☆19 · Jul 30, 2024 · Updated last year
- [PR 2024] TFS-ViT: Token-Level Feature Stylization for Domain Generalization ☆25 · Mar 29, 2023 · Updated 2 years ago
- ☆32 · Oct 6, 2024 · Updated last year
- PyTorch reimplementation of REALM and ORQA ☆22 · Feb 3, 2022 · Updated 4 years ago
- Recall to Imagine, a model-based RL algorithm with superhuman memory. Oral (1.2%) @ ICLR 2024 ☆79 · Jun 4, 2024 · Updated last year
- Code for "RADCoT: Retrieval-Augmented Distillation to Specialization Models for Generating Chain-of-Thoughts in Query Expansion", LREC-CO… ☆11 · May 25, 2024 · Updated last year
- Implementation of "Adversarial purification with Score-based generative models", ICML 2021 ☆30 · Oct 24, 2021 · Updated 4 years ago
- ☆11 · Jan 21, 2019 · Updated 7 years ago
- ☆20 · Jun 3, 2022 · Updated 3 years ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆188 · May 25, 2025 · Updated 9 months ago
- MIPS syntax highlighting package for Sublime Text 2 ☆13 · May 14, 2018 · Updated 7 years ago
- [NeurIPS'22] PyTorch library to compare similarity between NN representations ☆13 · Feb 27, 2025 · Updated last year
- A utility for storing and reading files for Korean LM training 💾 ☆35 · Oct 15, 2025 · Updated 5 months ago
- We can crawl NaverBlog, Twitter, Youtube!! ☆14 · Sep 13, 2019 · Updated 6 years ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" ☆603 · Oct 7, 2025 · Updated 5 months ago
- [ICCV2025 highlight] Rectifying Magnitude Neglect in Linear Attention ☆59 · Jul 24, 2025 · Updated 7 months ago