MatFormer repo
☆75Dec 9, 2024Updated last year
Alternatives and similar repositories for matformer
Users who are interested in matformer are comparing it to the libraries listed below.
- The official implementation of HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization☆19Mar 7, 2025Updated last year
- Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding☆219Jan 12, 2026Updated 5 months ago
- Use the tokenizer in parallel to achieve superior acceleration☆20Mar 21, 2024Updated 2 years ago
- Repository of PIXAR, a Pixel-based Auto-Regressive Language Model☆20Sep 15, 2025Updated 9 months ago
- Official repo for BWLer: Barycentric Weight Layer☆30Mar 20, 2026Updated 3 months ago
- Implementation of the dilated self attention as described in "LongNet: Scaling Transformers to 1,000,000,000 Tokens"☆13Jul 23, 2023Updated 2 years ago
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…☆31Nov 14, 2023Updated 2 years ago
- ☆50May 20, 2025Updated last year
- A RAG that can scale 🧑🏻‍💻☆11May 28, 2024Updated 2 years ago
- Code for "What really matters in matrix-whitening optimizers?"☆24Oct 31, 2025Updated 8 months ago
- NeMo: a toolkit for conversational AI☆13May 4, 2024Updated 2 years ago
- Vocabulary Trimming (VT) is a model compression technique, which reduces a multilingual LM vocabulary to a target language by deleting ir…☆67Oct 25, 2024Updated last year
- [ICML2025] Official code for "Reinforced Lifelong Editing for Language Models"☆23Feb 23, 2025Updated last year
- ☆10Oct 2, 2024Updated last year
- Source code for the paper "Positional Attention: Expressivity and Learnability of Algorithmic Computation"☆14May 26, 2025Updated last year
- ☆38Jan 26, 2024Updated 2 years ago
- ☆55Sep 26, 2025Updated 9 months ago
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba☆38Oct 16, 2025Updated 8 months ago
- YASEM - Yet Another Splade|Sparse Embedder - A simple and efficient library for SPLADE embeddings☆13May 22, 2025Updated last year
- Paper dataset for "Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers"☆13Oct 20, 2024Updated last year
- Official Repository for paper "Ontology-Free General-Domain Knowledge Graph-to-Text Generation Dataset Synthesis using Large Language Mod…☆15Nov 25, 2024Updated last year
- Run TFLITE models on the web☆13Jan 2, 2022Updated 4 years ago
- Official repo of dataset-decomposition paper [NeurIPS 2024]☆22Jan 8, 2025Updated last year
- ☆17Jan 5, 2023Updated 3 years ago
- [ICLR'25] Code for KaSA, an official implementation of "KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models"☆22Jan 16, 2025Updated last year
- Python library to use Pleias-RAG models☆72Jun 20, 2026Updated 2 weeks ago
- Landing repository for the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"☆46May 13, 2026Updated last month
- Official Repository for "Hypencoder: Hypernetworks for Information Retrieval"☆40Sep 20, 2025Updated 9 months ago
- A summarizer for Japanese articles (but ChatGPT is better)☆10Aug 1, 2022Updated 3 years ago
- PyTorch implementation of StableMask (ICML'24)☆15Jun 27, 2024Updated 2 years ago
- ☆139Jun 6, 2025Updated last year
- ☆93Aug 18, 2024Updated last year
- Transformers components but in Triton☆34May 9, 2025Updated last year
- RADLADS training code☆44May 7, 2025Updated last year
- Implementation of Influence Function approximations for differently sized ML models, using PyTorch☆18Sep 15, 2023Updated 2 years ago
- ☆13Nov 15, 2021Updated 4 years ago
- UQ: Assessing Language Models on Unsolved Questions☆30Aug 26, 2025Updated 10 months ago
- 🎨 Imagine what Picasso could have done with AI. Self-host your StableDiffusion API.☆50May 8, 2023Updated 3 years ago
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial)☆17Mar 20, 2024Updated 2 years ago