codalab / codabench

Codabench is a flexible, easy-to-use, and reproducible benchmarking platform. See our paper in Patterns (Cell Press): https://hubs.li/Q01fwRWB0

☆122

Alternatives and similar repositories for codabench

Users who are interested in codabench compare it to the repositories listed below.

catherinesyeh / attention-viz
Visualizing query-key interactions in language + vision transformers (VIS 2023)
☆156 Updated last year
Modalities / modalities
Modalities, a PyTorch-native framework for distributed and reproducible foundation model training.
☆91 Updated this week
awesome-mlops / awesome-ml-experiment-management
A curated list of awesome open source tools and commercial products for ML Experiment Tracking and Management 🚀
☆149 Updated last year
google-deepmind / codoc
☆117 Updated 2 years ago
pranftw / openreview_scraper
Scrape papers from OpenReview using OpenReview API
☆54 Updated 8 months ago
mlfoundations / rtfm
Research on Tabular Foundation Models
☆60 Updated 11 months ago
google-research / optformer
☆230 Updated last week
lucidrains / AMIE-pytorch
Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind
☆68 Updated last year
openreview / openreview-py
Official Python client library for the OpenReview API
☆211 Updated this week
pietrobarbiero / pytorch_explain
PyTorch Explain: Interpretable Deep Learning in Python.
☆164 Updated last year
facebookresearch / pal
PAL: Predictive Analysis & Laws of Large Language Models
☆38 Updated 10 months ago
HazyResearch / domino
☆139 Updated 2 years ago
allenai / discoverybench
Discovering Data-driven Hypotheses in the Wild
☆118 Updated 5 months ago
KatherLab / ToolMaker
Turn GitHub repositories into LLM tools. (ACL 2025)
☆55 Updated 6 months ago
zzachw / llemr
NeurIPS'24 DB (Spotlight) | Instruction Tuning Large Language Models to Understand Electronic Health Records
☆51 Updated 2 months ago
facebookresearch / nbm-spam
Training and evaluating NBM and SPAM for interpretable machine learning.
☆78 Updated 2 years ago
UW-Madison-Lee-Lab / LanguageInterfacedFineTuning
Code for Language-Interfaced FineTuning for Non-Language Machine Learning Tasks.
☆132 Updated last year
google-research / lanistr
☆69 Updated 3 months ago
facebookresearch / LAWT
Code for the papers "Linear Algebra with Transformers" (TMLR) and "What is my Math Transformer Doing?" (AI for Maths Workshop, NeurIPS 2022)
☆76 Updated last year
naszilla / tabzilla
Code for "TabZilla: When Do Neural Nets Outperform Boosted Trees on Tabular Data?"
☆172 Updated last year
noahho / CAAFE
Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automat…
☆177 Updated 11 months ago
mmcdermott / EventStreamGPT
Dataset and modelling infrastructure for modelling "event streams": sequences of continuous time, multivariate events with complex intern…
☆116 Updated 4 months ago
andylolu2 / ollm
End-to-End Ontology Learning with Large Language Models, NeurIPS 2024.
☆37 Updated last year
interpretml / LLM-Tabular-Memorization-Checker
Testing Language Models for Memorization of Tabular Datasets.
☆36 Updated 9 months ago
dilyabareeva / quanda
A toolkit for quantitative evaluation of data attribution methods.
☆54 Updated 4 months ago
MadryLab / modelcomponents
Decomposing and Editing Predictions by Modeling Model Computation
☆138 Updated last year
m42-health / med42
☆56 Updated 2 years ago
abacusai / xai-bench
XAI-Bench is a library for benchmarking feature attribution explainability techniques
☆70 Updated 2 years ago
canyuchen / ClinicalBench
Code for the paper "ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?"
☆29 Updated 5 months ago
mkirchhof / url
Uncertainty-aware representation learning (URL) benchmark
☆105 Updated 8 months ago