caiyuchen-ustc / Alpha-RL
On Predictability of Reinforcement Learning Dynamics for Large Language Models
☆45Updated last week
Alternatives and similar repositories for Alpha-RL
Users who are interested in Alpha-RL compare it to the repositories listed below.
- We introduce temporal working memory (TWM), which aims to enhance the temporal modeling capabilities of Multimodal foundation models (MFM…☆312Updated last week
- [BIRD-INTERACT] Re-imagines Text-to-SQL evaluation through the lens of dynamic interactions.☆451Updated 2 weeks ago
- [NeurIPS'25] KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems☆96Updated last month
- INFTY Engine: An Optimization Toolkit to Support Continual AI☆512Updated 2 months ago
- Group Expectation Policy Optimization for Heterogeneous Reinforcement Learning☆164Updated 2 weeks ago
- A Python implementation of several deep text hashing (also called deep semantic hashing) models☆79Updated 3 weeks ago
- ☆356Updated 5 months ago
- [COLM 2025] Assessing Judging Bias in Large Reasoning Models: An Empirical Study https://openreview.net/pdf?id=SlRtFwBdzP☆164Updated 2 months ago
- [MM 2024] Official code for VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness☆52Updated last year
- [AAAI 2026 Oral] Cook and Clean Together: Teaching Embodied Agents for Parallel Task Execution☆352Updated this week
- [ACL 2025 Oral] QAEncoder: Towards Aligned Representation Learning in Question Answering Systems☆176Updated 4 months ago
- ☆174Updated 2 months ago
- Enhanced Benchmark Creation Tool: Automates dataset profiling, model benchmarking, and performance visualization for streamlined evaluati…☆110Updated 3 weeks ago
- Code for "Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models" (Findings of ACL 2025)☆83Updated 4 months ago
- ☆515Updated 9 months ago
- AIGC Creative Suite☆202Updated 6 months ago
- F²-Gen - An open-source Financial Fraud Detection Data Generator Web Application☆367Updated last month
- ☆422Updated 5 months ago
- Open, reproducible benchmarks and practical recipes to reduce I/O bottlenecks and improve end-to-end performance in AI training and bulk …☆150Updated last month
- React Secure State☆171Updated last month
- A Trusted Human-Multi-Agent Reinforcement Learning Interaction Framework☆503Updated last month
- ☆86Updated 9 months ago
- Treat text as code to audit grammar, ruthlessly reporting errors in compiler style.☆181Updated last month
- Revolutionizing Cancer Treatment with AI & Robotics☆65Updated 8 months ago
- Open-source models for financial risk detection and fraud analytics☆426Updated 3 weeks ago
- Integrated Plant Single-Cell Database☆168Updated 4 months ago
- Code for the paper "Low-Light Video Enhancement via Spatial-Temporal Consistent Decomposition" (IJCAI 2025)☆84Updated 4 months ago
- A project that aims to improve LLMs' pixel reasoning ability.☆81Updated 3 months ago
- This is a data analysis project, thanks for watching☆82Updated 2 months ago
- For Instruction of GPT☆34Updated last year