xuhaoxh / infini-gram-mini
☆38 · Updated 2 months ago
Alternatives and similar repositories for infini-gram-mini
Users interested in infini-gram-mini are comparing it to the libraries listed below.
- DPO, but faster ☆46 · Updated last year
- Official Implementation of APB (ACL 2025 main Oral) ☆32 · Updated 9 months ago
- GoldFinch and other hybrid transformer components ☆45 · Updated last year
- Code repository for the public reproduction of the language modelling experiments in "MatFormer: Nested Transformer for Elastic Inference…" ☆30 · Updated 2 years ago
- A repository for research on medium-sized language models. ☆77 · Updated last year
- ☆39 · Updated 7 months ago
- Implementation of the paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in PyTo… ☆58 · Updated 2 weeks ago
- ☆60 · Updated 6 months ago
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆29 · Updated last week
- Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels. ☆79 · Updated 3 weeks ago
- Multi-Turn RL Training System with AgentTrainer for Language Model Game Reinforcement Learning ☆54 · Updated last month
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 · Updated 7 months ago
- A lightweight, user-friendly data-plane for LLM training. ☆37 · Updated 3 months ago
- JAX Scalify: end-to-end scaled arithmetics ☆17 · Updated last year
- The open-source materials for the paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆28 · Updated last year
- A byte-level decoder architecture that matches the performance of tokenized Transformers. ☆66 · Updated last year
- Beyond KV Caching: Shared Attention for Efficient LLMs ☆20 · Updated last year
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆40 · Updated 2 months ago
- Official implementation of ECCV24 paper: POA ☆24 · Updated last year
- imagetokenizer is a Python package that helps you encode visuals and generate visual token ids from a codebook; supports both image and video… ☆37 · Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆33 · Updated last year
- FlexAttention w/ FlashAttention3 Support ☆27 · Updated last year
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆35 · Updated 9 months ago
- ☆62 · Updated 2 weeks ago
- Fork of the Flame repo for training some new stuff in development ☆19 · Updated 2 weeks ago
- AI-Driven Research Systems (ADRS) ☆91 · Updated this week
- Here we will test various linear attention designs. ☆62 · Updated last year
- vLLM adapter for a TGIS-compatible gRPC server. ☆45 · Updated this week
- Repository for Sparse Finetuning of LLMs via a modified version of the MosaicML llmfoundry ☆42 · Updated last year
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Updated last year