☆242Feb 28, 2026Updated last month
Alternatives and similar repositories for llm-jepa
Users that are interested in llm-jepa are comparing it to the libraries listed below.
- ☆130Mar 6, 2026Updated last month
- A fast, lightweight, and extensible RWKV chat UI powered by Flutter. Offline-ready, multi-backend support, ideal for local RWKV inference…☆87Apr 7, 2026Updated last week
- ☆14Dec 12, 2024Updated last year
- Standalone repo for our Atropos integration with Thinking Machines Tinker API (https://thinkingmachines.ai/tinker/)☆44Mar 22, 2026Updated 3 weeks ago
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"☆15Apr 30, 2025Updated 11 months ago
- Scaling In-context Learning from Few-shot to 1,024-shot on Tabular ML☆59Dec 12, 2025Updated 4 months ago
- Explorations into the proposed SDFT, Self-Distillation Enables Continual Learning, from Shenfeld et al. of MIT☆31Feb 6, 2026Updated 2 months ago
- The repository contains code for Adaptive Data Optimization☆35Dec 9, 2024Updated last year
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation☆29Dec 22, 2025Updated 3 months ago
- GoldFinch and other hybrid transformer components☆46Jul 20, 2024Updated last year
- Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs"☆33Oct 12, 2025Updated 6 months ago
- Landing repository for the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"☆44Sep 12, 2025Updated 7 months ago
- ☆23Mar 7, 2025Updated last year
- Official implementation of ECCV24 paper: POA☆24Aug 8, 2024Updated last year
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆18Sep 13, 2024Updated last year
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated☆34Aug 14, 2024Updated last year
- Reliable, minimal and scalable library for pretraining foundation and world models☆185Updated this week
- [ICLR 2025] Official implementation of paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models".☆24Mar 16, 2025Updated last year
- [NeurIPS 2023] Latent Graph Inference with Limited Supervision☆28Feb 1, 2024Updated 2 years ago
- The official repo for the code and data of paper SMART☆40Feb 20, 2025Updated last year
- Experiments in Joint Embedding Predictive Architectures (JEPAs).☆50Jan 5, 2024Updated 2 years ago
- Custom triton kernels for training Karpathy's nanoGPT.☆19Oct 21, 2024Updated last year
- The official GitHub page for "Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with …☆29Dec 12, 2024Updated last year
- AGiXT is a dynamic AI Automation Platform that seamlessly orchestrates instruction management and complex task execution across diverse A…☆24Jan 26, 2026Updated 2 months ago
- This repo contains the code for paper "nuCarla: A nuScenes-Style Bird’s-Eye View Perception Dataset for CARLA Simulation"☆49Jan 2, 2026Updated 3 months ago
- RWKV-7 mini☆12Mar 29, 2025Updated last year
- ☆11Jun 22, 2025Updated 9 months ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi…☆23Oct 1, 2025Updated 6 months ago
- Interface for GenAI-Arena [NeurIPS24]☆17Feb 27, 2024Updated 2 years ago
- Deal with badly organized projects☆11Jan 4, 2026Updated 3 months ago
- ☆12Oct 18, 2023Updated 2 years ago
- All information and news with respect to Falcon-H1 series☆114Oct 9, 2025Updated 6 months ago
- code for EMNLP 2024 paper: How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for M…☆13Nov 17, 2024Updated last year
- MLX Implementation of Recursive Reasoning with Tiny Networks☆78Oct 11, 2025Updated 6 months ago
- DISCO: Comprehensive and Explainable Disinformation Detection, CIKM 2022☆10May 5, 2023Updated 2 years ago
- RWKV-LM-V7(https://github.com/BlinkDL/RWKV-LM) Under Lightning Framework☆59Dec 24, 2025Updated 3 months ago
- Simple and Ideal Circuit Simulation☆13Dec 4, 2017Updated 8 years ago
- Extending the Context of Pretrained LLMs by Dropping Their Positional Embedding☆213Jan 12, 2026Updated 3 months ago
- A powerful white-box adversarial attack that exploits knowledge about the geometry of neural networks to find minimal adversarial perturb…☆12Aug 5, 2020Updated 5 years ago