devvrit / matformer
MatFormer repo
☆26Updated 5 months ago
Alternatives and similar repositories for matformer
Users who are interested in matformer are comparing it to the libraries listed below.
- A repository for research on medium-sized language models.☆76Updated last year
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…☆14Updated last year
- https://x.com/BlinkDL_AI/status/1884768989743882276☆28Updated last month
- ☆49Updated 6 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore☆28Updated 8 months ago
- Triton Implementation of HyperAttention Algorithm☆48Updated last year
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆36Updated last year
- JAX Scalify: end-to-end scaled arithmetics☆16Updated 7 months ago
- ☆68Updated 10 months ago
- ☆63Updated 8 months ago
- Train, tune, and run inference with the Bamba model☆127Updated last month
- PyTorch/XLA SPMD test code on Google TPU☆23Updated last year
- ☆58Updated 2 weeks ago
- My fork of Allen AI's OLMo for educational purposes.☆30Updated 5 months ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.☆17Updated 2 months ago
- Collection of autoregressive model implementations☆85Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago
- DPO, but faster 🚀☆42Updated 5 months ago
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family.☆29Updated 2 months ago
- ☆44Updated last year
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"☆37Updated last year
- Load compute kernels from the Hub☆139Updated this week
- ☆34Updated 11 months ago
- GoldFinch and other hybrid transformer components☆45Updated 10 months ago
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode…☆45Updated last month
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated last week
- ☆47Updated 9 months ago
- Lightweight toolkit package to train and fine-tune 1.58-bit language models☆69Updated 2 weeks ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)☆31Updated 2 months ago
- ☆28Updated 4 months ago