KMnO4-zx / paper-agent
something for paper agent
☆11Updated last year
Alternatives and similar repositories for paper-agent
Users that are interested in paper-agent are comparing it to the libraries listed below
- 🎓Automatically update agent papers daily using GitHub Actions (updates every 12 hours). Daily updates of agent-related papers, with Chinese abstract translations included.☆31Updated last week
- ☆31Updated 6 months ago
- MLLM @ Game☆16Updated 8 months ago
- ☆43Updated 2 months ago
- ☆68Updated last year
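- Train a LLaVA model with better Chinese support, and open-source the training code and data.☆79Updated last year
- This project covers dataset synthesis, model training, and evaluation for the math problem-solving ability of large models, along with notes on related articles.☆99Updated last year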
- A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details.☆222Updated 6 months ago
- A repo showcasing the use of MCTS with LLMs to solve GSM8K problems☆94Updated 2 months ago
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory [COLM2025]☆199Updated 6 months ago
- DPO training for Tongyi Qianwen (Qwen)☆61Updated last year
- ☆56Updated last month
- P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark☆47Updated 7 months ago
- ☆46Updated 8 months ago
- ☆25Updated 9 months ago
- llm & rl☆271Updated 3 months ago
- LLM Tokenizer with BPE algorithm☆47Updated last year
- This project's issues hold all of my blog posts☆18Updated 4 months ago
- ☆130Updated 3 months ago
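- A very simple reproduction of Deepseek-R1-Zero and Deepseek-R1, using the "24 game" as an example. Uses zero-RL, SFT, and SFT+RL to elicit the LLM's ability to self-verify and reflect. About Clean, minimal, accessible reproduction of Dee…☆33Updated 10 months ago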
- ☆209Updated 3 months ago
- ICML2025: Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning☆51Updated 9 months ago
- ☆180Updated 9 months ago
- [ICLR 2026] PSFT is a trust-region–inspired fine-tuning objective that views SFT as a policy gradient method with constant advantages, co…☆34Updated 4 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆147Updated 9 months ago
- A mini assistant to help you read papers quickly☆54Updated 8 months ago
- ☆111Updated 7 months ago
- A small open source 3D agent simulator based on LLM.☆69Updated last year
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning☆89Updated 11 months ago
- ☆63Updated 9 months ago