facebookresearch / pal
PAL: Predictive Analysis & Laws of Large Language Models
☆36 · Updated 7 months ago
Alternatives and similar repositories for pal
Users interested in pal are comparing it to the repositories listed below.
- Recycling diverse models ☆45 · Updated 2 years ago
- ☆75 · Updated 3 months ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆31 · Updated last year
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning" ☆109 · Updated last month
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆76 · Updated last year
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind ☆66 · Updated 10 months ago
- Implementation of 🌻 Mirasol, SOTA multimodal autoregressive model out of Google Deepmind, in Pytorch ☆89 · Updated last year
- Repository containing awesome resources regarding Hugging Face tooling. ☆48 · Updated last year
- ☆59 · Updated last year
- A regression-like loss to improve numerical reasoning in language models ☆24 · Updated 3 weeks ago
- Implementation of a modular, high-performance, and simple Mamba for high-speed applications ☆36 · Updated 9 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ☆56 · Updated 2 months ago
- Code and pretrained models for the paper "MatMamba: A Matryoshka State Space Model" ☆60 · Updated 8 months ago
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction. ☆38 · Updated 4 months ago
- Implementation of Infini-Transformer in Pytorch ☆110 · Updated 7 months ago
- Supercharge huggingface transformers with model parallelism. ☆77 · Updated 2 weeks ago
- ML/DL math and method notes ☆63 · Updated last year
- ☆226 · Updated this week
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?" ☆18 · Updated 2 months ago
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin…" ☆21 · Updated 2 weeks ago
- Train, tune, and infer Bamba model ☆131 · Updated 2 months ago
- Interactive coding assistant for data scientists and machine learning developers, empowered by large language models. ☆95 · Updated 10 months ago
- ☆50 · Updated last year
- Explorations into the recently proposed Taylor Series Linear Attention ☆100 · Updated 11 months ago
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025] ☆19 · Updated 2 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆86 · Updated last year
- Model Stock: All we need is just a few fine-tuned models ☆121 · Updated 10 months ago
- Dedicated to building industrial foundation models for universal data intelligence across industries. ☆57 · Updated 11 months ago
- Utilities for Training Very Large Models ☆58 · Updated 10 months ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆120 · Updated 9 months ago