thepowerfuldeez / OLMo
My fork of Allen AI's OLMo for educational purposes.
☆28Updated this week
Related projects
Alternatives and complementary repositories for OLMo
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks☆129Updated 2 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"☆92Updated last month
- This is the official repository for Inheritune.☆105Updated last month
- ☆45Updated 2 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆50Updated 7 months ago
- A repository for research on medium sized language models.☆74Updated 5 months ago
- ☆40Updated 2 weeks ago
- ☆62Updated 3 months ago
- ☆63Updated last month
- Set of scripts to finetune LLMs☆36Updated 7 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,…☆43Updated 4 months ago
- A pipeline for LLM knowledge distillation☆78Updated 3 months ago
- ☆53Updated 5 months ago
- ☆48Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆46Updated 2 months ago
- ☆102Updated last month
- Collection of autoregressive model implementations☆67Updated this week
- Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding"☆38Updated last month
- The official repo for "LLoCo: Learning Long Contexts Offline"☆113Updated 5 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (Official Code)☆135Updated last month
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples'☆74Updated 10 months ago
- ☆37Updated 5 months ago
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv 2401.01335)☆28Updated 8 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆29Updated 6 months ago
- Data preparation code for Amber 7B LLM☆82Updated 6 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"☆78Updated 8 months ago
- ☆46Updated last week
- ☆87Updated 9 months ago
- A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs.☆74Updated last month
- The first dense retrieval model that can be prompted like an LM☆63Updated 2 months ago