CarperAI / OpenELM

Evolution Through Large Models

☆735

Alternatives and similar repositories for OpenELM

Users who are interested in OpenELM are comparing it to the libraries listed below.

SkunkworksAI / hydra-moe
☆415 Updated 2 years ago
ezelikman / parsel
Code for Parsel 🐍 - generate complex programs with language models
☆433 Updated 2 years ago
HazyResearch / H3
Language Modeling with the H3 State Space Model
☆519 Updated 2 years ago
google-deepmind / tracr
☆548 Updated last year
CarperAI / cheese
Used for adaptive human-in-the-loop evaluation of language and embedding models.
☆308 Updated 2 years ago
persimmon-ai-labs / adept-inference
Inference code for Persimmon-8B
☆412 Updated 2 years ago
noahshinn / reflexion-draft
Reflexion: an autonomous agent with dynamic memory and self-reflection
☆388 Updated 2 years ago
openai / automated-interpretability
☆1,057 Updated last year
rgreenblatt / arc_draw_more_samples_pub
Draw more samples
☆196 Updated last year
sanjeevanahilan / nanoChatGPT
A crude RLHF layer on top of nanoGPT with Gumbel-Softmax trick
☆293 Updated 2 years ago
mlfoundations / open_lm
A repository for research on medium sized language models.
☆520 Updated 5 months ago
HazyResearch / safari
Convolutions for Sequence Modeling
☆904 Updated last year
alasdairforsythe / tokenmonster
Ungreedy subword tokenizer and vocabulary trainer for Python, Go & Javascript
☆606 Updated last year
HazyResearch / m2
Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture"
☆561 Updated 11 months ago
salesforce / jaxformer
Minimal library to train LLMs on TPU in JAX with pjit().
☆299 Updated last year
yuchenlin / LLM-Blender
[ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive…
☆971 Updated last year
kuleshov-group / llmtools
Finetuning Large Language Models on One Consumer GPU in 2 Bits
☆733 Updated last year
tysam-code / hlb-gpt
Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi…
☆352 Updated last year
salesforce / CodeRL
This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (Neur…
☆556 Updated 10 months ago
anthropics / evals
☆313 Updated last year
OpenLemur / Lemur
[ICLR 2024] Lemur: Open Foundation Models for Language Agents
☆555 Updated 2 years ago
HazyResearch / ama_prompting
Ask Me Anything language model prompting
☆546 Updated 2 years ago
abertsch72 / unlimiformer
Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
☆1,063 Updated last year
AlignmentResearch / tuned-lens
Tools for understanding how transformer predictions are built layer-by-layer
☆549 Updated 3 months ago
lucidrains / memorizing-transformers-pytorch
Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate …
☆637 Updated 2 years ago
tomaarsen / attention_sinks
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
☆732 Updated last year
tatsu-lab / alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
☆837 Updated last year
booydar / recurrent-memory-transformer
[NeurIPS 22] [AAAI 24] Recurrent Transformer-based long-context architecture.
☆775 Updated last year
PiotrNawrot / nanoT5
Fast & Simple repository for pre-training and fine-tuning T5-style models
☆1,014 Updated last year
ShengranHu / Thought-Cloning
[NeurIPS '23 Spotlight] Thought Cloning: Learning to Think while Acting by Imitating Human Thinking
☆269 Updated last year