allenai / OLMo-core
PyTorch building blocks for the OLMo ecosystem
☆210 · Updated this week
Alternatives and similar repositories for OLMo-core
Users interested in OLMo-core are comparing it to the libraries listed below
- Reproducible, flexible LLM evaluations ☆200 · Updated this week
- 🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc. ☆352 · Updated this week
- A project to improve skills of large language models ☆367 · Updated this week
- Manage scalable open LLM inference endpoints in Slurm clusters ☆256 · Updated 10 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… (see the sketch after this list) ☆323 · Updated 5 months ago
- ☆184 · Updated 3 months ago
- An extension of the nanoGPT repository for training small MoE models. ☆140 · Updated 2 months ago
- A simple unified framework for evaluating LLMs ☆209 · Updated 3 weeks ago
- The HELMET Benchmark ☆143 · Updated 3 weeks ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024) ☆205 · Updated 11 months ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆191 · Updated 5 months ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆301 · Updated last year
- Benchmarking LLMs with Challenging Tasks from Real Users ☆221 · Updated 6 months ago
- SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning ☆180 · Updated this week
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆294 · Updated last week
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆199 · Updated last week
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆105 · Updated this week
- The official evaluation suite and dynamic data release for MixEval. ☆239 · Updated 6 months ago
- Code for training & evaluating Contextual Document Embedding models ☆184 · Updated this week
- prime-rl is a codebase for decentralized RL training at scale ☆89 · Updated this week
- Tina: Tiny Reasoning Models via LoRA ☆192 · Updated 3 weeks ago
- ☆111 · Updated 5 months ago
- LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens. ☆227 · Updated 8 months ago
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" ☆459 · Updated last year
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ☆131 · Updated 9 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆92 · Updated 3 weeks ago
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆244 · Updated this week
- A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM). ☆230 · Updated last week
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks" ☆188 · Updated last week
- Official repository for the paper "ReasonIR: Training Retrievers for Reasoning Tasks". ☆121 · Updated last week
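
The memory-layers entry above is the one item in this list that describes a concrete mechanism: a trainable key-value table where each token retrieves only a few slots, so parameter count can grow much faster than per-token compute. Below is a minimal PyTorch sketch of that idea, not code from any repository listed here; `MemoryLayer`, `num_slots`, and `top_k` are hypothetical names, and the flat key table is a deliberate simplification (published designs such as product-key memories factorize the keys so that even slot scoring stays sub-linear in the number of slots).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryLayer(nn.Module):
    """Minimal sketch of a trainable key-value memory lookup.

    Hypothetical simplification: a flat key table is scored densely here;
    real memory-layer designs factorize the key space for efficiency. The
    point this sketch shows is that only the top-k selected value rows
    contribute to the output, so the value table (most of the parameters)
    can grow without a matching growth in per-token FLOPs.
    """

    def __init__(self, d_model: int, num_slots: int = 4096, top_k: int = 4):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_slots, d_model) * 0.02)
        self.values = nn.Embedding(num_slots, d_model)  # the large, sparsely accessed table
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model). Score every slot, keep only the top-k.
        scores = x @ self.keys.t()                      # (batch, seq, num_slots)
        weights, idx = scores.topk(self.top_k, dim=-1)  # both (batch, seq, top_k)
        weights = F.softmax(weights, dim=-1)
        # Gather only the selected value rows; the rest of the table is untouched.
        selected = self.values(idx)                     # (batch, seq, top_k, d_model)
        return (weights.unsqueeze(-1) * selected).sum(dim=-2)

# Usage: a drop-in residual sublayer.
layer = MemoryLayer(d_model=64, num_slots=8192, top_k=4)
x = torch.randn(2, 16, 64)
out = x + layer(x)
print(out.shape)  # torch.Size([2, 16, 64])
```

The design point the description alludes to is the gather in `forward`: only `top_k` value rows participate in the output (and receive gradients), so doubling `num_slots` adds parameters without doubling the work done per token, modulo the dense scoring step this sketch keeps for simplicity.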