facebookresearch / pal
PAL: Predictive Analysis & Laws of Large Language Models
☆39 Updated 11 months ago
Alternatives and similar repositories for pal
Users who are interested in pal are comparing it to the repositories listed below.
- Recycling diverse models ☆46 Updated 2 years ago
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning" ☆117 Updated 5 months ago
- ☆79 Updated 3 weeks ago
- Minimum Description Length probing for neural network representations ☆20 Updated 10 months ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆33 Updated 2 years ago
- A regression-alike loss to improve numerical reasoning in language models - ICML 2025 ☆27 Updated 3 months ago
- Model Stock: All we need is just a few fine-tuned models ☆127 Updated 4 months ago
- ☆59 Updated last year
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind ☆72 Updated last year
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆45 Updated last month
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ☆57 Updated 6 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆91 Updated last year
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model" ☆61 Updated last year
- Sparse and discrete interpretability tool for neural networks ☆64 Updated last year
- Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper. ☆39 Updated 2 years ago
- Train, tune, and infer Bamba model ☆137 Updated 6 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 Updated last year
- Repository containing awesome resources regarding Hugging Face tooling. ☆48 Updated last year
- ☆52 Updated last year
- ☆26 Updated 2 years ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization" ☆55 Updated last year
- Aioli: A unified optimization framework for language model data mixing ☆31 Updated 10 months ago
- Implementation of a modular, high-performance, and simplistic mamba for high-speed applications ☆38 Updated last year
- ☆69 Updated last year
- ☆33 Updated 11 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆11 Updated last year
- ☆56 Updated last year
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆118 Updated 3 weeks ago
- Supercharge huggingface transformers with model parallelism. ☆77 Updated 4 months ago
- Code for Arxiv Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle ☆61 Updated 2 years ago