apple / ml-diffucoder
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
☆753Updated 3 months ago
Alternatives and similar repositories for ml-diffucoder
Users who are interested in ml-diffucoder are comparing it to the libraries listed below.
- Dream 7B, a large diffusion language model☆1,034Updated last month
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling.☆347Updated 4 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input☆900Updated 4 months ago
- Simple & Scalable Pretraining for Neural Architecture Research☆297Updated 2 months ago
- Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.☆682Updated last month
- Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004)☆656Updated 3 weeks ago
- [ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models☆864Updated 3 months ago
- Scaling RL on advanced reasoning models☆620Updated last week
- ShinkaEvolve: Towards Open-Ended and Sample-Efficient Program Evolution☆584Updated 2 weeks ago
- Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling☆448Updated 5 months ago
- Post-training with Tinker☆1,096Updated last week
- Pretraining and inference code for a large-scale depth-recurrent language model☆838Updated last week
- [ICLR2025] DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models☆320Updated 4 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…☆347Updated 10 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache☆127Updated 2 months ago
- Benchmark environment for evaluating vision-language models (VLMs) on popular video games!☆308Updated 5 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.☆325Updated last year
- Large multi-modal models (L3M) pre-training.☆217Updated last month
- [NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example☆365Updated 2 weeks ago
- Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"☆339Updated 3 months ago
- The official repo for "Parallel-R1: Towards Parallel Thinking via Reinforcement Learning"☆229Updated last week
- Official implementation of the NeurIPS 2025 paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space"☆262Updated last month
- PyTorch-native post-training at scale☆388Updated this week
- An AI benchmark for creative, human-like problem solving using Sudoku variants☆105Updated 3 months ago
- Tina: Tiny Reasoning Models via LoRA☆302Updated last month
- RLP: Reinforcement as a Pretraining Objective☆192Updated 3 weeks ago
- ☆828Updated last month
- ☆543Updated last month
- A Scientific Multimodal Foundation Model☆587Updated last month
- Esoteric Language Models☆103Updated 3 weeks ago