Official code for "KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation"
☆63Apr 17, 2026Updated 2 weeks ago
Alternatives and similar repositories for KnowU-Bench
Users that are interested in KnowU-Bench are comparing it to the libraries listed below.
- [ICLR 2026] InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models☆53Feb 12, 2026Updated 2 months ago
- Official code for "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization"☆228Apr 7, 2026Updated 3 weeks ago
- ☆32Aug 11, 2025Updated 8 months ago
- [NeurIPS 2025] Mind the Gap: Bridging Thought Leap for Improved CoT Tuning https://arxiv.org/abs/2505.14684☆48Oct 20, 2025Updated 6 months ago
- ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models☆73Mar 9, 2026Updated last month
- ☆37Oct 9, 2025Updated 6 months ago
- [NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604☆55Nov 4, 2025Updated 5 months ago
- This repository is the official implementation of TimeHC-RL (Distilabel (Data Generation) + TRL (SFT) + VeRL (GRPO)).☆48Jun 4, 2025Updated 10 months ago
- GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts☆40Sep 30, 2025Updated 7 months ago
- A curated collection of resources, tools, and frameworks for developing GUI Agents.☆409Apr 16, 2026Updated 2 weeks ago
- ☆28Aug 19, 2025Updated 8 months ago
- The official implementation of dLLM-Factory☆25Jul 12, 2025Updated 9 months ago
- Code for the ACL 2024 paper "PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning"☆14Aug 13, 2025Updated 8 months ago
- The source code of paper: Learning Disentangled Semantic Representations for Zero-Shot Cross-Lingual Transfer in Multilingual Machine Rea…☆12Apr 6, 2022Updated 4 years ago
- ☆13Mar 29, 2023Updated 3 years ago
- Competitive Programming Code Template☆11Nov 6, 2022Updated 3 years ago
- ☆80Updated this week
- code for GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation☆18Dec 7, 2024Updated last year
- MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models☆43Jan 28, 2026Updated 3 months ago
- MCM crawler: scrapes certificates from the US Mathematical Contest in Modeling (MCM) and analyzes their information with OCR☆16Jun 25, 2022Updated 3 years ago
- [AAAI 2026] GUI-G²: Gaussian Reward Modeling for GUI Grounding☆307Apr 15, 2026Updated 2 weeks ago
- Mind map of approaches for batch queries☆26Sep 28, 2020Updated 5 years ago
- Some small but useful scripts that help you with RK35588 or other Rockchips☆10May 17, 2023Updated 2 years ago
- ☆17Jun 9, 2025Updated 10 months ago
- mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models☆11Jan 19, 2024Updated 2 years ago
- Official completion of “Training on the Benchmark Is Not All You Need”.☆40Dec 31, 2024Updated last year
- ☆18May 11, 2025Updated 11 months ago
- ☆40Aug 28, 2025Updated 8 months ago
- ☆12Mar 13, 2025Updated last year
- The repository of the ACCV 2024 paper "FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Ge…☆11Jul 28, 2025Updated 9 months ago
- Follow Me: Conversation Planning for Target-driven Recommendation Dialogue Systems☆11Aug 1, 2023Updated 2 years ago
- ☆12Mar 31, 2020Updated 6 years ago
- Official Implementation of MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models☆13Nov 1, 2025Updated 6 months ago
- GroundCUA☆125Mar 24, 2026Updated last month
- Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI.☆97Oct 23, 2025Updated 6 months ago
- The official code of [ICLR 2026] TFPI: Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient …☆103Jan 27, 2026Updated 3 months ago
- Code and data for "Medical Dialogue Generation via Dual Flow Modeling" (ACL 2023 Findings)☆14Nov 22, 2023Updated 2 years ago
- 🐳 PyLoader: An asynchronous Python dataloader for loading big datasets, supporting PyTorch and TensorFlow 2.x.☆11Aug 29, 2021Updated 4 years ago
- Codes for our paper "Enhancing Continual Relation Extraction via Classifier Decomposition" (Findings of ACL2023)☆10Nov 29, 2023Updated 2 years ago