facebookresearch / pal
PAL: Predictive Analysis & Laws of Large Language Models
☆37 Updated 8 months ago
Alternatives and similar repositories for pal
Users who are interested in pal are comparing it to the libraries listed below:
- ☆77 Updated last month
- Recycling diverse models ☆45 Updated 2 years ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind ☆68 Updated last year
- Repository containing awesome resources regarding Hugging Face tooling. ☆48 Updated last year
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆78 Updated last year
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning" ☆112 Updated 2 months ago
- Model Stock: All we need is just a few fine-tuned models ☆123 Updated last month
- Train, tune, and infer Bamba model ☆132 Updated 3 months ago
- ☆51 Updated last year
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆31 Updated last year
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆89 Updated 3 months ago
- A regression-alike loss to improve numerical reasoning in language models - ICML 2025 ☆25 Updated last month
- ☆59 Updated last year
- Tree prompting: easy-to-use scikit-learn interface for improved prompting. ☆41 Updated last year
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from DeepMind ☆56 Updated 3 months ago
- ML/DL Math and Method notes ☆63 Updated last year
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?" ☆18 Updated 3 months ago
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model" ☆61 Updated 10 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 Updated last year
- Implementation of a modular, high-performance, and simplistic mamba for high-speed applications ☆36 Updated 10 months ago
- Generating and validating natural-language explanations for the brain. ☆57 Updated last week
- Can GPT-4 Perform Neural Architecture Search? ☆87 Updated 2 years ago
- We study toy models of skill learning. ☆31 Updated 8 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆36 Updated last year
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆24 Updated this week
- ☆28 Updated 2 months ago
- KV Cache Steering for Inducing Reasoning in Small Language Models ☆39 Updated last month
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025] ☆20 Updated 3 months ago
- Official repo of Progressive Data Expansion: data, code and evaluation ☆29 Updated last year
- ☆64 Updated last month