NVlabs / Jet-Nemotron
☆619 · Updated 3 weeks ago
Alternatives and similar repositories for Jet-Nemotron
Users interested in Jet-Nemotron are comparing it to the libraries listed below.
- DFloat11: Lossless LLM Compression for Efficient GPU Inference ☆536 · Updated 3 weeks ago
- Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004) ☆601 · Updated last week
- Checkpoint-engine, a simple middleware for updating model weights in LLM inference engines ☆516 · Updated this week
- Self-Adapting Language Models ☆785 · Updated last month
- CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning ☆183 · Updated last month
- ☆797 · Updated this week
- ☆322 · Updated last week
- Training teachers with reinforcement learning to make LLMs learn how to reason for test-time scaling ☆339 · Updated 2 months ago
- GRadient-INformed MoE ☆264 · Updated 11 months ago
- Sparse inference for transformer-based LLMs ☆197 · Updated last month
- Code to train and evaluate Neural Attention Memory Models, universally applicable memory systems for transformers ☆322 · Updated 10 months ago
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆370 · Updated 3 weeks ago
- Tencent Hunyuan A13B (Hunyuan-A13B for short), an innovative open-source LLM built on a fine-grained MoE architecture ☆748 · Updated 2 months ago
- A tree search library with a flexible API for LLM inference-time scaling ☆458 · Updated last month
- AlphaGo Moment for Model Architecture Discovery ☆1,072 · Updated last month
- Hypernetworks that adapt LLMs to specific benchmark tasks using only a textual task description as input ☆859 · Updated 3 months ago
- ☆295 · Updated last month
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆437 · Updated 3 weeks ago
- VPTQ, a flexible and extreme low-bit quantization algorithm ☆655 · Updated 4 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆291 · Updated 3 weeks ago
- Official PyTorch implementation of Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆123 · Updated last month
- Decentralized RL Training at Scale ☆592 · Updated this week
- A self-adaptation framework🐙 that adapts LLMs to unseen tasks in real time! ☆1,141 · Updated 7 months ago
- prime, a framework for efficient, globally distributed training of AI models over the internet ☆819 · Updated 3 months ago
- Code repository for the paper Competition and Attraction Improve Model Fusion ☆150 · Updated 3 weeks ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation ☆435 · Updated last month
- Code and data for the Chain-of-Draft (CoD) paper ☆325 · Updated 6 months ago
- ☆196 · Updated 9 months ago
- An open-source implementation of LFMs (Liquid Foundation Models) from Liquid AI ☆111 · Updated 11 months ago
- Inference, fine-tuning, and many more recipes with the Gemma family of models ☆266 · Updated last month