NVlabs / Jet-Nemotron
☆674 · Updated this week
Alternatives and similar repositories for Jet-Nemotron
Users who are interested in Jet-Nemotron are comparing it to the libraries listed below
- Research code artifacts for Code World Model (CWM), including inference tools, reproducibility, and documentation. ☆609 · Updated last week
- DFloat11: Lossless LLM Compression for Efficient GPU Inference ☆545 · Updated last month
- Checkpoint-engine is a simple middleware to update model weights in LLM inference engines ☆751 · Updated last week
- CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning ☆193 · Updated last month
- Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004) ☆638 · Updated this week
- Welcome to the official repository of SINQ! A novel, fast and high-quality quantization method designed to make any Large Language Model … ☆20 · Updated this week
- Self-Adapting Language Models ☆800 · Updated 2 months ago
- Advanced quantization algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA, and HPU. ☆651 · Updated this week
- Tencent Hunyuan A13B (Hunyuan-A13B for short), an innovative and open-source LLM built on a fine-grained MoE architecture. ☆754 · Updated 3 months ago
- VPTQ: a flexible and extreme low-bit quantization algorithm ☆658 · Updated 5 months ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025) ☆461 · Updated last week
- Training teachers with reinforcement learning to teach LLMs how to reason for test-time scaling. ☆343 · Updated 3 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only a textual task description as input ☆873 · Updated 3 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆296 · Updated last month
- The code repository of the paper: Competition and Attraction Improve Model Fusion ☆159 · Updated last month
- A Tree Search Library with Flexible API for LLM Inference-Time Scaling ☆475 · Updated 2 months ago
- Sparse inference for transformer-based LLMs ☆201 · Updated last month
- ☆388 · Updated this week
- ☆199 · Updated 9 months ago
- Post-training with Tinker ☆550 · Updated this week
- Code to train and evaluate Neural Attention Memory Models to obtain universally applicable memory systems for transformers. ☆322 · Updated 11 months ago
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning" ☆186 · Updated 2 weeks ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆125 · Updated last month
- All information and news with respect to the Falcon-H1 series ☆89 · Updated last month
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆456 · Updated last month
- ☆821 · Updated 3 weeks ago
- A command-line interface tool for serving LLMs using vLLM. ☆418 · Updated last month
- On the Theoretical Limitations of Embedding-Based Retrieval ☆567 · Updated 3 weeks ago
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,151 · Updated 8 months ago
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆441 · Updated last month