OpenEvaByte / evabyteLinks

EvaByte: Efficient Byte-level Language Models at Scale

☆111

Alternatives and similar repositories for evabyte

Users that are interested in evabyte are comparing it to the libraries listed below

Sorting:

casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆173Updated 10 months ago
SalesforceAIResearch / LaTRO
☆124Updated 9 months ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆108Updated 8 months ago
RobertCsordas / moeut
☆89Updated last year
HazyResearch / lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
☆249Updated 10 months ago
BlinkDL / modded-nanogpt-rwkv
RWKV-7: Surpassing GPT
☆101Updated last year
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆186Updated 10 months ago
Zyphra / tree_attention
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
☆130Updated last year
microsoft / ArchScale
Simple & Scalable Pretraining for Neural Architecture Research
☆302Updated last month
sebulo / LoQT
☆81Updated last year
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78Updated last year
HazyResearch / cartridges
Storing long contexts in tiny caches with self-study
☆218Updated last month
OSU-NLP-Group / GrokkedTransformer
Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
☆234Updated 4 months ago
kanishkg / stream-of-search
Repository for the paper Stream of Search: Learning to Search in Language
☆151Updated 10 months ago
vicksEmmanuel / latent-gemma
☆26Updated 10 months ago
wdlctc / mini-s
☆53Updated last year
sanyalsunny111 / LLM-Inheritune
This is the official repository for Inheritune.
☆115Updated 9 months ago
ScalingIntelligence / Archon
Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.
☆189Updated 8 months ago
ScalingIntelligence / large_language_monkeys
☆109Updated last year
Alex-Gurung / ReasoningNCP
Official repo for Learning to Reason for Long-Form Story Generation
☆72Updated 7 months ago
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆107Updated 8 months ago
LucasPrietoAl / grokking-at-the-edge-of-numerical-stability
☆105Updated 4 months ago
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆135Updated 11 months ago
JacobPfau / fillerTokens
☆75Updated last year
apple / ml-planner
☆56Updated last year
allenai / IFBench
☆92Updated last week
lucidrains / coconut-pytorch
Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch
☆180Updated 5 months ago
siyan-zhao / prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS …
☆60Updated last year
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆174Updated 5 months ago
NVIDIA / ngpt
Normalized Transformer (nGPT)
☆194Updated last year