facebookresearch / pal
PAL: Predictive Analysis & Laws of Large Language Models
☆38Updated 10 months ago
Alternatives and similar repositories for pal
Users who are interested in pal are comparing it to the libraries listed below.
- ☆78Updated 3 months ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning☆32Updated last year
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning"☆117Updated 4 months ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind☆68Updated last year
- Repository containing awesome resources regarding Hugging Face tooling.☆48Updated last year
- Train, tune, and run inference with the Bamba model☆136Updated 5 months ago
- ☆59Updated last year
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"☆19Updated 5 months ago
- Implementation of a modular, high-performance, and simplistic mamba for high-speed applications☆37Updated last year
- KV Cache Steering for Inducing Reasoning in Small Language Models☆42Updated 3 months ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)☆78Updated 2 years ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google DeepMind, in PyTorch☆90Updated last year
- Aioli: A unified optimization framework for language model data mixing☆28Updated 10 months ago
- Minimum Description Length probing for neural network representations☆20Updated 9 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆60Updated last year
- Supercharge huggingface transformers with model parallelism.☆77Updated 3 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆37Updated last year
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from DeepMind☆57Updated 5 months ago
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"☆61Updated last year
- Understanding how features learned by neural networks evolve throughout training☆39Updated last year
- A regression-like loss to improve numerical reasoning in language models - ICML 2025☆26Updated 3 months ago
- Library for the Test-based Calibration Error (TCE) metric to quantify the degree of classifier calibration.☆13Updated 2 years ago
- Recycling diverse models☆46Updated 2 years ago
- ML/DL Math and Method notes☆64Updated last year
- OLMost every training recipe you need to perform data interventions with the OLMo family of models.☆52Updated this week
- Model Stock: All we need is just a few fine-tuned models☆127Updated 3 months ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation☆45Updated last month
- ☆80Updated last year
- Timm model explorer☆42Updated last year
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction.☆42Updated 7 months ago