chunhuizhang / prompts_for_academic
☆105Updated last month
Alternatives and similar repositories for prompts_for_academic
Users that are interested in prompts_for_academic are comparing it to the libraries listed below
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems☆94Updated 2 months ago
- llm & rl☆268Updated 3 months ago
- PSFT is a trust-region–inspired fine-tuning objective that views SFT as a policy gradient method with constant advantages, constraining p…☆34Updated 4 months ago
- This is the reading list for the survey "A Survey on the Optimization of LLM-based Agents". We will keep adding papers and improving the…☆184Updated 6 months ago
- ☆186Updated 3 months ago
- Generative AI Act II: Test Time Scaling Drives Cognition Engineering☆209Updated 9 months ago
- ☆207Updated last week
- Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models☆122Updated 3 months ago
- Awesome LLM pre-training resources, including data, frameworks, and methods.☆318Updated 9 months ago
- Reinforcement Learning in LLM and NLP.☆62Updated last month
- [AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆95Updated 2 months ago
- The official implementation of the paper "Mem-α: Learning Memory Construction via Reinforcement Learning"☆149Updated last month
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆281Updated 11 months ago
- A research repo for experiments about Reinforcement Finetuning☆53Updated 9 months ago
- ☆179Updated 9 months ago
- OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards☆267Updated this week
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents☆290Updated 2 months ago
- A Comprehensive Survey on Long Context Language Modeling☆219Updated 2 months ago
- A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details.☆222Updated 6 months ago
- Scaling Preference Data Curation via Human-AI Synergy☆137Updated 6 months ago
- ☆41Updated 10 months ago
- MiroRL is an MCP-first reinforcement learning framework for deep research agents.☆225Updated 5 months ago
- ☆78Updated 8 months ago
- Towards a Unified View of Large Language Model Post-Training☆199Updated 4 months ago
- A Collection of Papers about Memory for Language Agents☆289Updated last week
- WritingBench: A Comprehensive Benchmark for Generative Writing☆155Updated last month
- ☆222Updated last month
- Curated systems, benchmarks, and papers etc. on memory for LLMs/MLLMs --- long-term context, retrieval, and reasoning.☆202Updated this week
- ☆67Updated last year
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning☆155Updated last year