X-PLUG / OSWorld-MCP
☆203 · Updated last week
Alternatives and similar repositories for OSWorld-MCP
Users who are interested in OSWorld-MCP are comparing it to the libraries listed below.
- DPO-Shift: Shifting the Distribution of Direct Preference Optimization ☆60 · Updated 9 months ago
- Marco Search Agent for Realistic and Challenging Agentic Search ☆240 · Updated 2 months ago
- Code for "FaithLens: Detecting and Explaining Faithfulness Hallucination" ☆89 · Updated last week
- Dataset and evaluation code of ISDrama (ACM-MM 2025): Immersive Spatial Drama Generation through Multimodal Prompting ☆236 · Updated 4 months ago
- ☆198 · Updated 2 months ago
- Repo-level benchmark for real-world Code Agents: from repo understanding → env setup → incremental dev/bug-fixing → task delivery, with c… ☆244 · Updated 3 months ago
- ☆207 · Updated 7 months ago
- A pytorch implementation of the paper "TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Simi… ☆343 · Updated 2 weeks ago
- Official Pytorch implementation for ICML 2025 paper "Large Continual Instruction Assistant" ☆66 · Updated last week
- [AAAI 2026 Oral] Official repository for InfiGUI-G1. We introduce Adaptive Exploration Policy Optimization (AEPO) to overcome semantic al… ☆115 · Updated last month
- ☆127 · Updated 2 months ago
- This is the code for Visual Reasoning Sequential Attack, which is a method to jailbreak Multimodal Large Language Models Based on their v… ☆64 · Updated 3 weeks ago
- A powerful multi-format file parsing, data cleaning, and AI annotation toolkit. ☆143 · Updated 3 weeks ago
- [MM 2024] Official code for VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness ☆52 · Updated last year
- ☆356 · Updated 6 months ago
- ☆38 · Updated 8 months ago
- Group Expectation Policy Optimization for Heterogeneous Reinforcement Learning ☆164 · Updated last month
- This repo collects research papers that use AI tools and are in the field of scientific research (including computer science, agronomy, c… ☆98 · Updated 9 months ago
- 4th Place Solution for the Kaggle Competition: LMSYS - Chatbot Arena Human Preference Predictions ☆171 · Updated last year
- NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents ☆132 · Updated 2 weeks ago
- [ACL 2025] FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation ☆62 · Updated 6 months ago
- (EMNLP 2025 Findings) Source Evaluation scripts for Humanity's Last Code Exam ☆95 · Updated 4 months ago
- [TMC 2025/NOSSDAV 2023] Official code for RepCaM++ and RepCaM: Re-parameterization Content-aware Modulation for Neural Video Delivery ☆54 · Updated 8 months ago
- [AAAI 2026 Oral] Cook and Clean Together: Teaching Embodied Agents for Parallel Task Execution ☆356 · Updated 3 weeks ago
- [ACL 2025 Oral] QAEncoder: Towards Aligned Representation Learning in Question Answering Systems ☆176 · Updated 5 months ago
- This is the pytorch implementation for AAAI2022 paper "Hierarchical Image Generation via Transformer-Based Sequential Patch Selection" ☆84 · Updated 3 years ago
- A benchmark suite for evaluating LLM-based interactive scientific reasoning. ☆91 · Updated 2 months ago
- Repository for the paper: ☆69 · Updated last year
- A light-weight framework for building llm agentic systems with additional supports for program synthesis and neural-symbolic research. ☆88 · Updated last month
- [COLM 2025] Assessing Judging Bias in Large Reasoning Models: An Empirical Study https://openreview.net/pdf?id=SlRtFwBdzP ☆164 · Updated 3 months ago