☆221Feb 28, 2026Updated 3 weeks ago
Alternatives and similar repositories for llm-jepa
Users who are interested in llm-jepa are comparing it to the repositories listed below
- ☆99Mar 6, 2026Updated 2 weeks ago
- A fast, lightweight, and extensible RWKV chat UI powered by Flutter. Offline-ready, multi-backend support, ideal for local RWKV inference…☆83Updated this week
- Scaling In-context Learning from Few-shot to 1,024-shot on Tabular ML☆59Dec 12, 2025Updated 3 months ago
- GoldFinch and other hybrid transformer components☆45Jul 20, 2024Updated last year
- ☆11Oct 11, 2023Updated 2 years ago
- Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs"☆31Oct 12, 2025Updated 5 months ago
- ☆23Mar 7, 2025Updated last year
- Official implementation of ECCV24 paper: POA☆24Aug 8, 2024Updated last year
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆18Sep 13, 2024Updated last year
- ☆177Apr 23, 2025Updated 10 months ago
- RePo: Language Models with Context Re-Positioning☆74Dec 24, 2025Updated 2 months ago
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo…☆15Nov 11, 2024Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated☆34Aug 14, 2024Updated last year
- Repository for ‘Anomaly Detection and Generation with Diffusion Models: A Survey’.☆35Jun 15, 2025Updated 9 months ago
- Hub for Open Source AGiXT Extensions, Chains, Prompts, and Agents.☆17Sep 27, 2023Updated 2 years ago
- Survey and Benchmark of Anomaly Detection in Business Processes☆18Jan 23, 2026Updated last month
- Custom triton kernels for training Karpathy's nanoGPT.☆19Oct 21, 2024Updated last year
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…☆31Nov 14, 2023Updated 2 years ago
- AGiXT is a dynamic AI Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse A…☆23Jan 26, 2026Updated last month
- RWKV-7 mini☆12Mar 29, 2025Updated 11 months ago
- ☆11Jun 22, 2025Updated 8 months ago
- rl from zero pretrain, can it be done? yes.☆289Sep 28, 2025Updated 5 months ago
- Source code of ACL 2023 Main Conference Paper "PAD-Net: An Efficient Framework for Dynamic Networks".☆11Feb 28, 2026Updated 2 weeks ago
- hopefully I can continuously develop the project.☆29Dec 16, 2022Updated 3 years ago
- All information and news with respect to Falcon-H1 series☆108Oct 9, 2025Updated 5 months ago
- MLX Implementation of Recursive Reasoning with Tiny Networks☆79Oct 11, 2025Updated 5 months ago
- DISCO: Comprehensive and Explainable Disinformation Detection, CIKM 2022☆10May 5, 2023Updated 2 years ago
- RWKV-LM-V7(https://github.com/BlinkDL/RWKV-LM) Under Lightning Framework☆57Dec 24, 2025Updated 2 months ago
- Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding☆210Jan 12, 2026Updated 2 months ago
- Scaling Long-Horizon LLM Agent via Context-Folding☆128Jan 26, 2026Updated last month
- ☆26Mar 5, 2023Updated 3 years ago
- ☆19Mar 31, 2024Updated last year
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models☆238Oct 14, 2025Updated 5 months ago
- [CVPR 2025] Parallel Sequence Modeling via Generalized Spatial Propagation Network☆111Jul 18, 2025Updated 8 months ago
- WuBu Nesting Playground, Inspired by XJDR Entropy, Now Hyperbolic Math Focused☆26Mar 9, 2026Updated last week
- REAP expert pruning for MoE LLMs on Apple Silicon via MLX☆45Updated this week
- ☆59Mar 2, 2026Updated 2 weeks ago
- ☆10Oct 18, 2021Updated 4 years ago
- virtual node analysis on ogb benchmark dataset☆14Mar 9, 2023Updated 3 years ago