apple / ml-diffucoder
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
☆726 Updated 2 months ago
Alternatives and similar repositories for ml-diffucoder
Users who are interested in ml-diffucoder are comparing it to the repositories listed below.
- Dream 7B, a large diffusion language model ☆959 Updated 3 weeks ago
- Training teachers with reinforcement learning able to make LLMs learn how to reason for test time scaling. ☆339 Updated 2 months ago
- Hypernetworks that adapt LLMs for specific benchmark tasks using only textual task description as the input ☆859 Updated 3 months ago
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models ☆807 Updated 2 months ago
- Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling ☆438 Updated 3 months ago
- Simple & Scalable Pretraining for Neural Architecture Research ☆291 Updated 3 weeks ago
- Self-Adapting Language Models ☆781 Updated last month
- Code for R-Zero: Self-Evolving Reasoning LLM from Zero Data (https://www.arxiv.org/pdf/2508.05004) ☆601 Updated this week
- Scaling RL on advanced reasoning models ☆583 Updated last month
- Pretraining and inference code for a large-scale depth-recurrent language model ☆826 Updated last week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆343 Updated 9 months ago
- Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning" ☆302 Updated 2 months ago
- Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache ☆123 Updated last month
- An AI benchmark for creative, human-like problem solving using Sudoku variants ☆97 Updated last month
- Checkpoint-engine is a simple middleware to update model weights in LLM inference engines ☆516 Updated this week
- [ICLR2025] DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models ☆296 Updated 3 months ago
- Code for the paper: "Learning to Reason without External Rewards" ☆351 Updated 2 months ago
- ☆477 Updated last month
- GRadient-INformed MoE ☆264 Updated 11 months ago
- A Scientific Multimodal Foundation Model ☆561 Updated last week
- Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text" ☆293 Updated 8 months ago
- Official implementation of the paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space" ☆219 Updated last week
- ☆786 Updated 2 weeks ago
- Releases from OpenAI Preparedness ☆857 Updated 2 weeks ago
- Esoteric Language Models ☆96 Updated last month
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆220 Updated this week
- Benchmark environment for evaluating vision-language models (VLMs) on popular video games! ☆302 Updated 3 months ago
- ☆1,122 Updated last week
- MMaDA - Open-Sourced Multimodal Large Diffusion Language Models ☆1,341 Updated 3 weeks ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation ☆435 Updated last month