gingasan / delta-engine
☆19 Updated 10 months ago
Alternatives and similar repositories for delta-engine
Users who are interested in delta-engine are comparing it to the repositories listed below.
- The Code Repo for Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization ☆124 Updated last year
- ☆119 Updated last year
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆43 Updated last year
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs ☆92 Updated last year
- ☆17 Updated 2 years ago
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation ☆118 Updated 5 months ago
- ☆116 Updated 7 months ago
- Reformatted Alignment ☆112 Updated last year
- ☆36 Updated last year
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks ☆87 Updated 5 months ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆190 Updated 7 months ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆137 Updated 5 months ago
- Official completion of “Training on the Benchmark Is Not All You Need”. ☆37 Updated 10 months ago
- The official repository of the Omni-MATH benchmark. ☆88 Updated 10 months ago
- Official Code for "Coser: Coordinating LLM-Based Persona Simulation of Established Roles" ☆140 Updated 4 months ago
- ☆46 Updated 5 months ago
- ☆83 Updated last year
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆88 Updated 5 months ago
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI ☆108 Updated 8 months ago
- Recent advancements propelled by large language models (LLMs), encompassing an array of domains including Vision, Audio, Agent, Robotics,… ☆123 Updated 5 months ago
- MiroTrain is an efficient and algorithm-first framework for post-training large agentic models. ☆91 Updated 2 months ago
- ☆98 Updated 3 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆47 Updated 8 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ☆147 Updated 11 months ago
- [ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset ☆109 Updated 5 months ago
- ☆170 Updated this week
- This is the official implementation of "Progressive-Hint Prompting Improves Reasoning in Large Language Models" ☆209 Updated 2 years ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆138 Updated 7 months ago
- Unleashing the Power of Cognitive Dynamics on Large Language Models ☆63 Updated last year
- ☆61 Updated 3 weeks ago