Code and training scripts for FlexOlmo
☆135Apr 7, 2026Updated last week
Alternatives and similar repositories for FlexOlmo
Users that are interested in FlexOlmo are comparing it to the libraries listed below.
- Gantry provides an API that streamlines running experiments in Beaker☆33Updated this week
- [COLM 2025] JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model☆27Nov 25, 2025Updated 4 months ago
- German Language Understanding Evaluation Benchmark @NAACL24☆22Dec 11, 2025Updated 4 months ago
- Data generation and training repository for SERA: Soft-Verified Efficient Repository Agents.☆138Mar 8, 2026Updated last month
- ☆13Jun 16, 2021Updated 4 years ago
- Reproducible, flexible LLM evaluations☆359Mar 24, 2026Updated 3 weeks ago
- Code for paper "Out-of-Domain Robustness via Targeted Augmentations"☆14Feb 25, 2023Updated 3 years ago
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).☆48Mar 31, 2026Updated 2 weeks ago
- ☆77Apr 29, 2024Updated last year
- LOLA: Large and Open Source Multilingual Language Model☆11Updated this week
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆116Sep 26, 2024Updated last year
- C++ version of ViT☆12Nov 13, 2022Updated 3 years ago
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"☆20Apr 9, 2025Updated last year
- [COLM 2025] EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees☆31Jul 11, 2025Updated 9 months ago
- [NeurIPS 2022] "A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models", Yuanxin Liu, Fandong Meng, Zheng Lin, Jiangnan Li…☆21Jan 9, 2024Updated 2 years ago
- ☆70Updated this week
- Implementation Code for "LLM-based Medical Assistant Personalization with Short- and Long-Term Memory Coordination"☆14Apr 25, 2025Updated 11 months ago
- Fully open reproduction of DeepSeek-R1☆11Mar 24, 2025Updated last year
- ☆40Jan 26, 2025Updated last year
- Code for AAAI 2023 Paper: “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models”☆18Dec 6, 2022Updated 3 years ago
- Enhanced version of WikiExtractor: a Wikipedia dumps extractor☆28Sep 17, 2025Updated 6 months ago
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient☆66Aug 3, 2025Updated 8 months ago
- Research on the usage of Jupyter notebooks☆19Sep 12, 2019Updated 6 years ago
- ☆34Jun 28, 2025Updated 9 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆49Feb 4, 2026Updated 2 months ago
- [AAAI 2026] AD-L-JEPA: Self-Supervised Representation Learning with Joint Embedding Predictive Architecture for Automotive LiDAR Object D…☆34Nov 18, 2025Updated 4 months ago
- ☆43Sep 15, 2025Updated 6 months ago
- Word acquisition in neural language models (TACL 2022).☆20Jan 30, 2025Updated last year
- SPEC-RL: Accelerating On-Policy Reinforcement Learning via Speculative Rollouts☆63Dec 1, 2025Updated 4 months ago
- ☆22Dec 18, 2024Updated last year
- [NeurIPS 2025] MergeBench: A Benchmark for Merging Domain-Specialized LLMs☆46Feb 11, 2026Updated 2 months ago
- SimKO: Simple Pass@K Policy Optimization☆28Oct 24, 2025Updated 5 months ago
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)☆189Feb 17, 2025Updated last year
- Official repository for FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models☆35Sep 19, 2025Updated 6 months ago
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource…☆27Feb 16, 2026Updated last month
- ☆17Jun 10, 2025Updated 10 months ago
- [Ebook] From Zero to a Million Stores: A Hands-On System Design Journey by an Ordinary Person Without a Computer Science Degree☆26Nov 11, 2025Updated 5 months ago
- KV Cache Steering for Inducing Reasoning in Small Language Models☆48Jul 24, 2025Updated 8 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization☆54Jul 15, 2025Updated 8 months ago