codalab / codabench
Codabench is a flexible, easy-to-use, and reproducible benchmarking platform. See our paper in Patterns (Cell Press): https://hubs.li/Q01fwRWB0
☆85 Updated this week
Alternatives and similar repositories for codabench:
Users who are interested in codabench are comparing it to the libraries listed below:
- Official Python client library for the OpenReview API ☆172 Updated this week
- Discovering Data-driven Hypotheses in the Wild ☆65 Updated 4 months ago
- ☆34 Updated 3 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆71 Updated 5 months ago
- ☆145 Updated last year
- Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implici… ☆104 Updated last year
- A curated list of awesome open source tools and commercial products for ML Experiment Tracking and Management 🚀 ☆126 Updated 8 months ago
- OpenReview Submission Visualization (ICLR 2024/2025) ☆152 Updated 5 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆33 Updated last year
- ☆117 Updated 2 months ago
- A list of awesome neural symbolic papers. ☆47 Updated 2 years ago
- Language models scale reliably with over-training and on downstream tasks ☆96 Updated 11 months ago
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models ☆45 Updated last year
- ☆61 Updated 2 years ago
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding. ☆37 Updated last week
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data ☆35 Updated last month
- Scrape papers from OpenReview using OpenReview API ☆32 Updated 3 weeks ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆58 Updated last year
- The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le… ☆90 Updated 3 years ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆255 Updated last year
- ☆29 Updated 10 months ago
- Code for Benchmarking Language Model Agents for Data-Driven Science ☆24 Updated 5 months ago
- A collection of AWESOME language modeling techniques on tabular data applications. ☆29 Updated 5 months ago
- ☆137 Updated last year
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆43 Updated 4 months ago
- Google Research ☆46 Updated 2 years ago
- ☆13 Updated last year
- ☆12 Updated 3 years ago
- Modalities, a PyTorch-native framework for distributed and reproducible foundation model training. ☆74 Updated this week
- Building modular LMs with parameter-efficient fine-tuning. ☆98 Updated this week