fabienfrfr / tptt
TPTT: Transforming Pretrained Transformers into Titans
☆57 Updated 2 months ago
Alternatives and similar repositories for tptt
Users that are interested in tptt are comparing it to the libraries listed below
- Resa: Transparent Reasoning Models via SAEs ☆47 Updated 4 months ago
- The official repo for "Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem" [EMNLP25] ☆33 Updated 5 months ago
- Official repo of paper LM2 ☆46 Updated 11 months ago
- A repository for research on medium sized language models. ☆77 Updated last year
- ☆63 Updated 7 months ago
- ☆100 Updated 5 months ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆59 Updated 10 months ago
- ☆40 Updated 9 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 Updated last month
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards ☆35 Updated 4 months ago
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc. ☆38 Updated last year
- DPO, but faster ☆47 Updated last year
- Code for Paper: Harnessing Webpage UIs for Text-Rich Visual Understanding ☆53 Updated last year
- Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs ☆33 Updated last month
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters" ☆22 Updated 3 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆35 Updated 10 months ago
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508) ☆65 Updated last week
- The open-source code of MetaStone-S1. ☆105 Updated 6 months ago
- Official Implementation of APB (ACL 2025 main Oral) and Spava. ☆32 Updated this week
- Lottery Ticket Adaptation ☆39 Updated last year
- ☆100 Updated this week
- [TMLR 2026] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models ☆121 Updated 11 months ago
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following" ☆31 Updated 8 months ago
- Official Repository for Task-Circuit Quantization ☆24 Updated 8 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆48 Updated 11 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ☆59 Updated 8 months ago
- LLM-I: Transform LLMs into natural interleaved multimodal creators! ✨ Tool-use framework supporting image search, generation, code ex… ☆40 Updated 3 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆61 Updated last year
- ☆67 Updated 10 months ago
- ☆34 Updated last year