llm-efficiency-challenge / neurips_llm_efficiency_challengeLinks

NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day

☆256

Alternatives and similar repositories for neurips_llm_efficiency_challenge

Users that are interested in neurips_llm_efficiency_challenge are comparing it to the libraries listed below

Sorting:

huggingface / datablations
Scaling Data-Constrained Language Models
☆342Updated 4 months ago
huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆276Updated last year
allenai / fm-cheatsheet
Website for hosting the Open Foundation Models Cheat Sheet.
☆267Updated 6 months ago
mlfoundations / open_lm
A repository for research on medium sized language models.
☆518Updated 5 months ago
normster / llm_rules
RuLES: a benchmark for evaluating rule-following in language models
☆238Updated 8 months ago
sabetAI / BLoRA
batched loras
☆347Updated 2 years ago
r-three / git-theta
git extension for {collaborative, communal, continual} model development
☆215Updated last year
lm-sys / llm-decontaminator
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
☆312Updated last year
xrsrke / pipegoose
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
☆87Updated last year
lucidrains / CALM-pytorch
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind
☆177Updated last year
imoneoi / multipack
Multipack distributed sampler for fast padding-free training of LLMs
☆201Updated last year
p-lambda / dsir
DSIR large-scale data selection framework for language model training
☆265Updated last year
abacaj / train-with-fsdp
☆94Updated 2 years ago
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆238Updated last month
gautierdag / bpeasy
Fast bare-bones BPE for modern tokenizer training
☆168Updated 4 months ago
srush / GPTWorld
A puzzle to learn about prompting
☆134Updated 2 years ago
tcapelle / llm_recipes
A set of scripts and notebooks on LLM finetunning and dataset creation
☆111Updated last year
IBM / ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp…
☆224Updated last month
JinjieNi / MixEval
The official evaluation suite and dynamic data release for MixEval.
☆252Updated last year
srush / do-we-need-attention
☆166Updated 2 years ago
hydrallm / llama-moe-v1
☆95Updated 2 years ago
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆109Updated last week
srush / annotated-mamba
Annotated version of the Mamba paper
☆490Updated last year
llm-random / llm-random
☆204Updated 3 weeks ago
huggingface / picotron_tutorial
☆225Updated 3 weeks ago
booydar / babilong
BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach.
☆215Updated 2 months ago
CASE-Lab-UMD / LLM-Drop
The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
☆179Updated 7 months ago
llm-merging / LLM-Merging
LLM-Merging: Building LLMs Efficiently through Merging
☆205Updated last year
yuhuixu1993 / qa-lora
Official PyTorch implementation of QA-LoRA
☆143Updated last year
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆172Updated 4 months ago