thepowerfuldeez / OLMo
My fork of Allen AI's OLMo for educational purposes.
☆30 · Updated last month
Alternatives and similar repositories for OLMo:
Users interested in OLMo are comparing it to the libraries listed below:
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆96 · Updated 4 months ago
- ☆70 · Updated 5 months ago
- A repository for research on medium-sized language models. ☆76 · Updated 8 months ago
- This is the official repository for Inheritune. ☆109 · Updated 3 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆42 · Updated 6 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆139 · Updated 4 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆53 · Updated 5 months ago
- Collection of autoregressive model implementations ☆77 · Updated 3 weeks ago
- ☆47 · Updated 5 months ago
- ☆72 · Updated 2 weeks ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (NeurIPS 2024) ☆149 · Updated last month
- FuseAI Project ☆80 · Updated this week
- ☆98 · Updated last week
- Implementation of the paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆53 · Updated this week
- ☆48 · Updated 2 months ago
- Repo hosting code and materials related to speeding up LLMs' inference using token merging. ☆34 · Updated 9 months ago
- Data preparation code for the CrystalCoder 7B LLM ☆44 · Updated 8 months ago
- Train, tune, and infer the Bamba model ☆80 · Updated 2 weeks ago
- A byte-level decoder architecture that matches the performance of tokenized Transformers. ☆65 · Updated 9 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆50 · Updated 9 months ago
- The official repo for "LLoCo: Learning Long Contexts Offline" ☆114 · Updated 7 months ago
- ☆57 · Updated this week
- ☆44 · Updated 7 months ago
- ☆38 · Updated 11 months ago
- Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding" ☆45 · Updated last month
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335) ☆29 · Updated 10 months ago
- The official code repo and data hub of the top_nsigma sampling strategy for LLMs. ☆20 · Updated last week
- A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs. ☆80 · Updated last week
- A pipeline for LLM knowledge distillation ☆85 · Updated this week