iiisthu / ailab
☆36Updated 7 months ago
Alternatives and similar repositories for ailab
Users that are interested in ailab are comparing it to the libraries listed below
- ☆24Updated 2 months ago
- A comprehensive framework for benchmarking single and multi-agent systems across a wide range of tasks—evaluating performance, accuracy, …☆35Updated 2 months ago
- [ICLR 2025 Oral] PyTorch code for the paper "Open-World Reinforcement Learning over Long Short-Term Imagination"☆188Updated 2 months ago
- Run TRex with PPO☆39Updated 7 months ago
- Official repository for "CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation"☆58Updated 3 weeks ago
- Curation of resources for LLM research, screened by @tongyx361 to ensure high quality and accompanied with elaborately-written concise de…☆63Updated last year
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆395Updated 3 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.☆407Updated 6 months ago
- Shanghai Jiao Tong University 2023-2024, CS3601 Operating System☆21Updated 2 years ago
- An RL framework for multi-LLM agent systems☆91Updated this week
- Use clash on Linux without sudo privileges☆165Updated last year
- ☆326Updated 7 months ago
- Paper list for Efficient Reasoning.☆784Updated 2 weeks ago
- ☆208Updated 5 months ago
- Training VLM agents with multi-turn reinforcement learning☆365Updated last week
- llm & rl☆266Updated 2 months ago
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation☆330Updated 8 months ago
- 😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond☆325Updated last week
- ☆21Updated 5 months ago
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning☆404Updated last year
- ☆83Updated last year
- The MiniAgents visualization tool for simulacra.☆17Updated last year
- A toolbox for benchmarking Multimodal LLM Agents trustworthiness across truthfulness, controllability, safety and privacy dimensions thro…☆61Updated this week
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents☆263Updated last month
- Tsinghua Cloud (清华大学云盘) batch download helper, for cases where a shared file is too large to download directly; this script adds several more practical features☆226Updated last year
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆278Updated 10 months ago
- Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome.☆535Updated last month
- [NeurIPS 2025] TTRL: Test-Time Reinforcement Learning☆945Updated 3 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.☆253Updated last week
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)☆686Updated 11 months ago