☆68Nov 26, 2024Updated last year
Alternatives and similar repositories for skywork-o1-prm-inference
Users that are interested in skywork-o1-prm-inference are comparing it to the libraries listed below
- A thin cython wrapper around llama.cpp, whisper.cpp and stable-diffusion.cpp☆16Feb 10, 2026Updated 3 weeks ago
- Useful Collection of Claude Code Configurations☆24Updated this week
- [ICML'25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an…☆13Apr 17, 2025Updated 10 months ago
- Reasoning Activation in LLMs via Small Model Transfer (NeurIPS 2025)☆21Oct 16, 2025Updated 4 months ago
- ☆23Oct 23, 2025Updated 4 months ago
- official implementation of paper "Process Reward Model with Q-value Rankings"☆66Feb 5, 2025Updated last year
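- Implementations of the 23 design patterns in different languages. The PHP and Java versions are carefully written sample code; the Python and JavaScript versions try to use language-level features for comparison with the Java versions of the patterns. The C version exists because I read a booklet on writing C in an object-oriented style, so it is only an experiment. Finally, Lisp and S…☆13Feb 22, 2013Updated 13 years ago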
- An automated data pipeline scaling RL to pretraining levels☆73Oct 11, 2025Updated 4 months ago
- [ICML 2025] Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search☆108Jun 3, 2025Updated 9 months ago
- OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models☆1,835Jan 17, 2025Updated last year
- UnifiedToolHub is a comprehensive project supporting LLM-based tool use, designed to unify various tool-use dataset formats and provide t…☆19Jul 23, 2025Updated 7 months ago
- various experiments for scaling inference time compute with small reasoning models☆17Jan 16, 2025Updated last year
- Deploying full-stack on-prem deep research agent that can be run entirely on a local machine for $0!☆33Nov 8, 2025Updated 4 months ago
- A powerful system for crawling documentation websites, extracting code snippets, and providing fast search capabilities via MCP (Model C…☆27Dec 25, 2025Updated 2 months ago
- ☆968Jan 23, 2025Updated last year
- ☆1,344Nov 21, 2024Updated last year
- ☆11Updated this week
- private-machine is an AI companion system with emotion, needs and goals simulation. Very silly, not based on real science.☆30Feb 26, 2026Updated last week
- ☆28Oct 2, 2025Updated 5 months ago
- Luann (fka TypeAgent) allows you to create many LLM-based agents (various types of agents, scale up)☆24Feb 9, 2026Updated 3 weeks ago
- This repo offers advanced tutorials for LLMs, BERT-based models, and multimodal models, covering fine-tuning, quantization, vocabulary ex…☆24May 5, 2025Updated 10 months ago
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …☆18Apr 12, 2024Updated last year
- ☆18Dec 12, 2025Updated 2 months ago
- Unleashing the Power of Reinforcement Learning for Math and Code Reasoners☆742Jun 6, 2025Updated 9 months ago
- [NeurIPS 2025 Spotlight] LLM post-training suite — featuring ReasonFlux, ReasonFlux-PRM, and ReasonFlux-Coder.☆521Sep 27, 2025Updated 5 months ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)☆692Jan 20, 2025Updated last year
- O1 Replication Journey☆2,000Jan 14, 2025Updated last year
- ☆24Jan 22, 2025Updated last year
- A WebUI for Side-by-Side Comparison of Media (Images/Videos) Across Multiple Folders☆25Feb 21, 2025Updated last year
- An Open Large Reasoning Model for Real-World Solutions☆1,537Feb 13, 2026Updated 3 weeks ago
- Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks"☆39Apr 24, 2025Updated 10 months ago
- Pytorch hierarchical attention neural network for text classification☆24Sep 1, 2017Updated 8 years ago
- OpenPipe Reinforcement Learning Experiments☆32Mar 14, 2025Updated 11 months ago
- Writing Tools, Apple's AI-inspired app, enchants Windows, enhancing your pen with AI LLMs. One hotkey press, system-wide, fixes grammar, …☆27Jul 26, 2025Updated 7 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.☆41Apr 4, 2025Updated 11 months ago
- ☆57Feb 10, 2025Updated last year
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies☆61Feb 6, 2026Updated last month
- A website for accessing many models through an API (DeepSeek, Qwen, Hunyuan, etc.)☆17Jul 12, 2025Updated 7 months ago
- Unofficial implementation of the CoT-decoding method for extracting CoT paths in an unsupervised way☆21Jan 11, 2026Updated last month