☆211 · Oct 27, 2025 · Updated 4 months ago
Alternatives and similar repositories for steplaw
Users that are interested in steplaw are comparing it to the libraries listed below
- Ongoing research project for code&math LLMs ☆27 · Jul 4, 2025 · Updated 8 months ago
- Heuristic filtering framework for RefineCode ☆83 · Mar 13, 2025 · Updated 11 months ago
- [ACL 2025 Oral🔥] Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling ☆22 · Nov 11, 2025 · Updated 3 months ago
- ☆22 · Oct 22, 2024 · Updated last year
- Research work aimed at addressing the problem of modeling infinite-length context ☆46 · Dec 18, 2025 · Updated 2 months ago
- CVE-Factory ☆58 · Feb 13, 2026 · Updated 3 weeks ago
- LongAttn: Selecting Long-context Training Data via Token-level Attention ☆15 · Jul 16, 2025 · Updated 7 months ago
- Automatic prompt optimization framework for multi-step agent tasks. ☆37 · Nov 12, 2024 · Updated last year
- ☆109 · Jul 15, 2025 · Updated 7 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆55 · Oct 29, 2024 · Updated last year
- ☆453 · Aug 10, 2025 · Updated 6 months ago
- The RedStone repository includes code for preparing extensive datasets used in training large language models. ☆156 · Jan 22, 2026 · Updated last month
- ☆50 · Aug 21, 2025 · Updated 6 months ago
- Muon is Scalable for LLM Training ☆1,440 · Aug 3, 2025 · Updated 7 months ago
- Use the tokenizer in parallel to achieve superior acceleration ☆20 · Mar 21, 2024 · Updated last year
- ☆55 · Feb 24, 2026 · Updated last week
- A Comprehensive Survey on Long Context Language Modeling ☆228 · Nov 24, 2025 · Updated 3 months ago
- The official implementation for the intra-stage fusion technique introduced in https://arxiv.org/abs/2409.13221 ☆31 · Apr 22, 2025 · Updated 10 months ago
- Official Repo for Open-Reasoner-Zero ☆2,087 · Jun 2, 2025 · Updated 9 months ago
- ☆129 · Jun 6, 2025 · Updated 9 months ago
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data … ☆833 · Mar 17, 2025 · Updated 11 months ago
- This repo is to demo the concept of lossless compression with Transformers as encoder and decoder. ☆14 · May 2, 2024 · Updated last year
- The code and data for the paper JiuZhang3.0 ☆49 · May 26, 2024 · Updated last year
- FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice. ☆242 · Feb 25, 2026 · Updated last week
- Muon fsdp 2 ☆55 · Aug 8, 2025 · Updated 6 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆227 · Nov 4, 2025 · Updated 4 months ago
- ☆978 · Feb 7, 2025 · Updated last year
- A series of technical reports on Slow Thinking with LLM ☆761 · Aug 13, 2025 · Updated 6 months ago
- Reinforcing General Reasoning without Verifiers ☆96 · Jun 24, 2025 · Updated 8 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆134 · Mar 21, 2025 · Updated 11 months ago
- InternEvo is an open-source lightweight training framework that aims to support model pre-training without the need for extensive dependencie… ☆418 · Aug 21, 2025 · Updated 6 months ago
- ☆813 · Jun 9, 2025 · Updated 8 months ago
- Reproduce R1 Zero on Logic Puzzle ☆2,439 · Mar 20, 2025 · Updated 11 months ago
- An official implementation of Style-Talker for Spoken Dialogue Generation ☆23 · Jan 12, 2025 · Updated last year
- [ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale ☆266 · Jul 8, 2025 · Updated 7 months ago
- O1 Replication Journey ☆2,000 · Jan 14, 2025 · Updated last year
- Fast LLM Training CodeBase with dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆40 · Jan 4, 2024 · Updated 2 years ago
- Async pipelined version of Verl ☆124 · Apr 8, 2025 · Updated 10 months ago
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs ☆57 · Nov 19, 2025 · Updated 3 months ago