jerber / lang-jepa
☆94 Updated 2 weeks ago
Alternatives and similar repositories for lang-jepa:
Users who are interested in lang-jepa are comparing it to the libraries listed below:
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆136 Updated this week
- smolLM with Entropix sampler on pytorch ☆147 Updated 2 months ago
- ☆96 Updated 2 months ago
- look how they massacred my boy ☆63 Updated 2 months ago
- ☆121 Updated 4 months ago
- Simple Transformer in Jax ☆127 Updated 6 months ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ☆162 Updated this week
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆118 Updated 2 months ago
- ☆115 Updated 3 weeks ago
- ☆79 Updated this week
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆89 Updated last month
- smol models are fun too ☆85 Updated 2 months ago
- An introduction to LLM Sampling ☆75 Updated 3 weeks ago
- Draw more samples ☆182 Updated 6 months ago
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆59 Updated 2 months ago
- DeMo: Decoupled Momentum Optimization ☆163 Updated last month
- σ-GPT: A New Approach to Autoregressive Models ☆61 Updated 4 months ago
- supporting pytorch FSDP for optimizers ☆75 Updated last month
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆181 Updated 7 months ago
- Entropy Based Sampling and Parallel CoT Decoding ☆17 Updated 3 months ago
- Long context evaluation for large language models ☆195 Updated this week
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆296 Updated 3 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆150 Updated 2 months ago
- ☆68 Updated 4 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆118 Updated 2 weeks ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆116 Updated 5 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆98 Updated 3 weeks ago
- Collection of autoregressive model implementations ☆76 Updated this week
- ☆51 Updated 3 weeks ago
- The history files when recording human interaction while solving ARC tasks ☆96 Updated this week