allenai / fm-cheatsheet
Website for hosting the Open Foundation Models Cheat Sheet.
☆267Updated 2 weeks ago
Alternatives and similar repositories for fm-cheatsheet:
Users who are interested in fm-cheatsheet are comparing it to the libraries listed below:
- Manage scalable open LLM inference endpoints in Slurm clusters☆254Updated 9 months ago
- A repository for research on medium-sized language models.☆495Updated last week
- ☆130Updated last month
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"☆301Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day☆255Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs☆188Updated 8 months ago
- Scaling Data-Constrained Language Models☆334Updated 7 months ago
- A puzzle to learn about prompting☆127Updated last year
- The official evaluation suite and dynamic data release for MixEval.☆238Updated 5 months ago
- ☆515Updated 5 months ago
- Extract full next-token probabilities via language model APIs☆242Updated last year
- RuLES: a benchmark for evaluating rule-following in language models☆223Updated 2 months ago
- Toolkit for attaching, training, saving and loading of new heads for transformer models☆276Updated 2 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆196Updated last week
- ☆181Updated 2 months ago
- ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp…☆220Updated last year
- ☆115Updated 3 weeks ago
- Understand and test language model architectures on synthetic tasks.☆194Updated last month
- A bagel, with everything.☆320Updated last year
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).☆199Updated 4 months ago
- ☆186Updated this week
- Code for the paper "Fishing for Magikarp"☆155Updated last month
- ☆231Updated last month
- git extension for {collaborative, communal, continual} model development☆211Updated 5 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax☆569Updated this week
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.☆198Updated 9 months ago
- Official repository for ORPO☆450Updated 11 months ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'☆189Updated 5 months ago
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"☆459Updated last year
- ☆92Updated last year