CL-bench: A Benchmark for Context Learning
☆563 May 12, 2026 Updated last month
Alternatives and similar repositories for CL-bench
Users that are interested in CL-bench are comparing it to the libraries listed below.
- LMAct: A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations ☆31 May 21, 2025 Updated last year
- ☆14 Oct 28, 2023 Updated 2 years ago
- ☆359 Jul 29, 2025 Updated 11 months ago
- Implementation of <Symbolic Graphics Programming with Large Language Models> ☆38 Sep 14, 2025 Updated 9 months ago
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆229 Apr 13, 2026 Updated 2 months ago
- ☆19 Mar 10, 2025 Updated last year
- Project for SNARE benchmark ☆11 Jun 5, 2024 Updated 2 years ago
- ☆15 Jul 5, 2024 Updated last year
- ☆134 May 12, 2026 Updated last month
- [ACL 2026 Oral] From Word to World: Can Large Language Models be Implicit Text-based World Models? ☆63 Apr 13, 2026 Updated 2 months ago
- ☆31 Feb 10, 2025 Updated last year
- ☆44 Feb 4, 2026 Updated 4 months ago
- ☆15 Dec 3, 2025 Updated 6 months ago
- [ICML 2026] Orienting Latent Actions for Video World Modeling ☆110 Apr 20, 2026 Updated 2 months ago
- (ACL 2026 Findings) Learning on the Job: An Experience-Driven, Self-Evolving Agent for Long-Horizon Tasks ☆93 Oct 16, 2025 Updated 8 months ago
- [ICLR 2026] The implementation of paper "AlphaSteer: Learning Refusal Steering with Principled Null-Space Constraint" ☆60 Nov 20, 2025 Updated 7 months ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25] ☆101 Apr 9, 2025 Updated last year
- LLM evaluation on 2024 Chinese Gaokao Mathematics: zero-contamination benchmark with dual prompt formats ☆21 Apr 15, 2026 Updated 2 months ago
- Code accompanying the NeurIPS 2019 paper AutoAssist: A Framework to Accelerate Training of Deep Neural Networks. ☆14 Oct 3, 2022 Updated 3 years ago
- ☆16 Dec 9, 2023 Updated 2 years ago
- ☆41 Dec 26, 2025 Updated 6 months ago
- Official implementation for "ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation" ☆21 Jan 31, 2026 Updated 5 months ago
- ☆32 Oct 2, 2025 Updated 9 months ago
- ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs. ☆47 Nov 17, 2025 Updated 7 months ago
- A simple Python wrapper for the ClearNLP constituents-to-dependencies converter ☆11 Nov 2, 2015 Updated 10 years ago
- [NeurIPS 2025 D&B Track] MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research ☆31 May 8, 2026 Updated last month
- ☆11 Mar 22, 2024 Updated 2 years ago
- ConvGQR: Generative Query Reformulation for Conversational Search. A codebase for ACL 2023 accepted paper. ☆35 Mar 5, 2024 Updated 2 years ago
- [ICLR 2026] dParallel: Learnable Parallel Decoding for dLLMs ☆65 Apr 12, 2026 Updated 2 months ago
- PeRL: Parameter-Efficient Reinforcement Learning ☆81 May 20, 2026 Updated last month
- ☆23 Nov 8, 2023 Updated 2 years ago
- Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency" ☆33 Apr 12, 2025 Updated last year
- Source code for PECRS (EACL 2024) ☆12 Feb 3, 2024 Updated 2 years ago
- TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment ☆10 Mar 1, 2025 Updated last year
- Source code for the NAACL 2021 paper: "Distantly Supervised Relation Extraction with Sentence Reconstruction and Knowledge Base Priors" ☆12 Jul 15, 2021 Updated 4 years ago
- [ECCV24] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization ☆14 Nov 27, 2024 Updated last year
- [AAAI 2024] LLMEval Phase II dataset: professional domain evaluation across 12 academic disciplines ☆71 May 21, 2026 Updated last month
- [COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free ☆59 Apr 6, 2025 Updated last year
- Python Lib for Keenetic Routers ☆12 Aug 8, 2024 Updated last year