allenai / fm-cheatsheet
Website for hosting the Open Foundation Models Cheat Sheet.
☆267 · Updated 2 months ago
Alternatives and similar repositories for fm-cheatsheet
Users interested in fm-cheatsheet are comparing it to the repositories listed below.
- Manage scalable open LLM inference endpoints in Slurm clusters ☆262 · Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day ☆256 · Updated last year
- A repository for research on medium-sized language models. ☆502 · Updated last month
- Let's build better datasets, together! ☆260 · Updated 6 months ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆306 · Updated last year
- Scaling Data-Constrained Language Models ☆337 · Updated 2 weeks ago
- Batched LoRAs ☆343 · Updated last year
- RuLES: a benchmark for evaluating rule-following in language models ☆227 · Updated 4 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆242 · Updated 8 months ago
- PyTorch building blocks for the OLMo ecosystem ☆258 · Updated this week
- Multipack distributed sampler for fast padding-free training of LLMs ☆194 · Updated 11 months ago
- ☆134 · Updated 3 months ago
- A puzzle to learn about prompting ☆130 · Updated 2 years ago
- Extract full next-token probabilities via language model APIs ☆247 · Updated last year
- ☆237 · Updated 3 months ago
- Code for training and evaluating Contextual Document Embedding models ☆194 · Updated last month
- ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts ☆222 · Updated last year
- A comprehensive deep dive into the world of tokens ☆224 · Updated last year
- ☆415 · Updated last year
- ☆523 · Updated 7 months ago
- Fast bare-bones BPE for modern tokenizer training ☆159 · Updated 2 weeks ago
- BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach. ☆203 · Updated 2 months ago
- ☆259 · Updated this week
- Evaluation suite for LLMs ☆352 · Updated 3 months ago
- ☆127 · Updated 3 months ago
- Code for the NeurIPS'24 paper "Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization" ☆220 · Updated 7 months ago
- Toolkit for attaching, training, saving, and loading of new heads for transformer models ☆282 · Updated 4 months ago
- Pre-training code for the Amber 7B LLM ☆166 · Updated last year
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore" ☆206 · Updated last month
- A bagel, with everything. ☆322 · Updated last year