allenai / OLMo-core
PyTorch building blocks for the OLMo ecosystem
☆165 Updated this week
Alternatives and similar repositories for OLMo-core:
Users who are interested in OLMo-core are comparing it to the libraries listed below.
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc. ☆224 Updated 2 weeks ago
- ☆158 Updated last month
- Reproducible, flexible LLM evaluations ☆176 Updated 3 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆253 Updated 8 months ago
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ☆209 Updated this week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆307 Updated 3 months ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆206 Updated 10 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆233 Updated 4 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆223 Updated last month
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆196 Updated this week
- ☆136 Updated 4 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆196 Updated 8 months ago
- Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆232 Updated 2 weeks ago
- ☆98 Updated 3 months ago
- A pipeline for LLM knowledge distillation ☆98 Updated last month
- Implementation of the LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper ☆128 Updated 8 months ago
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance… ☆148 Updated 2 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users ☆218 Updated 4 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆125 Updated 3 months ago
- A project to improve skills of large language models ☆256 Updated this week
- 🚢 Data Toolkit for Sailor Language Models ☆87 Updated last month
- This is the official repository for Inheritune. ☆109 Updated last month
- Code for training & evaluating Contextual Document Embedding models ☆176 Updated 2 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆164 Updated 2 weeks ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆168 Updated 2 months ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆186 Updated 7 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆100 Updated 4 months ago