CL-bench: A Benchmark for Context Learning
☆423 · Feb 8, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for CL-bench
Users who are interested in CL-bench are comparing it to the repositories listed below
- [ICML 2025] Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling ☆12 · May 5, 2025 · Updated 9 months ago
- TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment ☆10 · Mar 1, 2025 · Updated last year
- [NeurIPS 2023] and [ICLR 2024] for robustness certification. ☆10 · Nov 30, 2024 · Updated last year
- ☆13 · Feb 11, 2019 · Updated 7 years ago
- ☆32 · Feb 4, 2026 · Updated 3 weeks ago
- A Shortcut-aware Video-QA Benchmark for Physical Understanding via Minimal Video Pairs ☆35 · Sep 22, 2025 · Updated 5 months ago
- Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning (ICML 2023) ☆19 · Dec 15, 2023 · Updated 2 years ago
- GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators ☆47 · Dec 23, 2025 · Updated 2 months ago
- Official implementation for "ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation" ☆21 · Jan 31, 2026 · Updated last month
- ☆19 · Mar 10, 2025 · Updated 11 months ago
- This repository contains the code and data for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents wit… ☆55 · Feb 7, 2026 · Updated 3 weeks ago
- ☆352 · Jul 29, 2025 · Updated 7 months ago
- Learning on the Job: An Experience-Driven, Self-Evolving Agent for Long-Horizon Tasks ☆81 · Oct 16, 2025 · Updated 4 months ago
- [NeurIPS DB 2025] IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering ☆44 · Oct 15, 2025 · Updated 4 months ago
- Official PyTorch implementation of "Scaling Up Personalized Image Aesthetic Assessment via Task Vector Customization" (ECCV 2024) ☆32 · Mar 10, 2025 · Updated 11 months ago
- Awesome multi-modal large language paper/project, collections of popular training strategies, e.g., PEFT, LoRA. ☆26 · Aug 2, 2024 · Updated last year
- ☆37 · May 15, 2025 · Updated 9 months ago
- ☆67 · Feb 13, 2026 · Updated 2 weeks ago
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation ☆46 · Aug 26, 2025 · Updated 6 months ago
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024) ☆62 · Mar 30, 2024 · Updated last year
- Target-oriented Proactive Dialogue Systems with Personalization: Problem Formulation and Dataset Curation (EMNLP 2023) ☆31 · Oct 18, 2025 · Updated 4 months ago
- CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method ☆27 · Oct 9, 2025 · Updated 4 months ago
- Implementation of GradLoc from the Tencent Hunyuan blog "Stabilizing RLVR via Token-level Gradient Diagnosis and Layerwise Clipping". ☆36 · Feb 16, 2026 · Updated 2 weeks ago
- Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency" ☆32 · Apr 12, 2025 · Updated 10 months ago
- ☆28 · Oct 21, 2019 · Updated 6 years ago
- ECCV 2024 ☆26 · Oct 24, 2024 · Updated last year
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆42 · Feb 13, 2025 · Updated last year
- [COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆52 · Apr 6, 2025 · Updated 10 months ago
- Evaluate the Quality of Critique ☆36 · Jun 1, 2024 · Updated last year
- BeHonest: Benchmarking Honesty in Large Language Models ☆34 · Aug 15, 2024 · Updated last year
- A library for developing and applying Seldonian algorithms ☆12 · Jan 13, 2024 · Updated 2 years ago
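- Question bank for the National College Student Software Testing Competition (2016~2024), covering the full set of problems from both the international and domestic contests; compiled by students, so it has gaps and omissions ☆14 · Nov 17, 2024 · Updated last year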
- ☆11 · Mar 11, 2024 · Updated last year
- ☆10 · Apr 26, 2023 · Updated 2 years ago
- This repo contains the code to reproduce figures in my dissertation "Passive Imaging and Characterization of the Subsurface With Distribu… ☆10 · Jun 14, 2018 · Updated 7 years ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models ☆45 · Jun 14, 2024 · Updated last year
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift ☆38 · Jan 25, 2024 · Updated 2 years ago
- ☆41 · Nov 30, 2023 · Updated 2 years ago
- [Advanced Photonics Research, 2021] Control tightly focused fields via manipulating pupil functions ☆10 · Dec 25, 2024 · Updated last year