xjdr-alt / entropix
Entropy Based Sampling and Parallel CoT Decoding
☆2,970 Updated this week
Related projects
Alternatives and complementary repositories for entropix
- Optimizing inference proxy for LLMs ☆1,366 Updated this week
- NanoGPT (124M) quality in 7.8 8xH100-minutes ☆965 Updated this week
- ☆2,737 Updated 2 months ago
- The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework Join our Community: https://discord.com/servers/agora-999382051… ☆1,730 Updated this week
- System 2 Reasoning Link Collection ☆687 Updated 2 weeks ago
- A benchmark to evaluate language models on questions I've previously asked them to solve. ☆908 Updated last week
- o1-engineer is a command-line tool designed to assist developers in managing and interacting with their projects efficiently. Leveraging … ☆2,794 Updated this week
- Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs. ☆4,178 Updated last week
- High performance AI inference stack. Built for production. @ziglang / @openxla / MLIR / @bazelbuild ☆1,639 Updated this week
- Mixture of Agents using Groq ☆922 Updated 3 months ago
- Automated Design of Agentic Systems ☆1,021 Updated last week
- g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ☆3,873 Updated last month
- Things you can do with the token embeddings of an LLM ☆1,311 Updated this week
- Tools for merging pretrained large language models. ☆4,798 Updated last week
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality! ☆3,220 Updated 3 months ago
- AdalFlow: The library to build & auto-optimize LLM applications. ☆2,002 Updated this week
- TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients. ☆1,805 Updated last week
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆801 Updated 2 months ago
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-… ☆3,041 Updated 2 months ago
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. ☆1,829 Updated 3 months ago
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ☆2,538 Updated last month
- A native PyTorch Library for large model training ☆2,586 Updated last week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆1,617 Updated this week
- Implementation for MatMul-free LM. ☆2,918 Updated last week
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆2,823 Updated this week
- Efficient Triton Kernels for LLM Training ☆3,401 Updated this week
- Distributed Training Over-The-Internet ☆683 Updated 2 months ago
- ReFT: Representation Finetuning for Language Models ☆1,149 Updated last week
- ☆920 Updated last week
- The code used to train and run inference with the ColPali architecture. ☆1,076 Updated this week