☆61 · Apr 2, 2025 · Updated 11 months ago
Alternatives and similar repositories for liveswebench
Users that are interested in liveswebench are comparing it to the libraries listed below
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts" ☆19 · May 29, 2023 · Updated 2 years ago
- [ACL25] FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation ☆46 · Jan 28, 2026 · Updated last month
- The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models" ☆34 · Jun 29, 2024 · Updated last year
- ☆13 · Oct 5, 2025 · Updated 5 months ago
- Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding” ☆31 · May 1, 2023 · Updated 2 years ago
- ☆12 · Feb 22, 2021 · Updated 5 years ago
- Python GUI for differential forms ☆13 · Oct 14, 2023 · Updated 2 years ago
- Code for "Zero-Shot Out-of-Distribution Detection with Feature Correlations" ☆13 · Jan 19, 2020 · Updated 6 years ago
- Python powered music controlling webpage with websockets and bottle py (works with spotify, vlc, audacious, and others) ☆11 · Jun 9, 2017 · Updated 8 years ago
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents ☆21 · Jan 6, 2026 · Updated 2 months ago
- Kernel Playground - A playground to run large scale experiments on the Linux Kernel ☆17 · Nov 8, 2025 · Updated 4 months ago
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization" ☆41 · Sep 24, 2024 · Updated last year
- Chainer and PyTorch implementation of GAN with gradient reversal layer ☆10 · Mar 19, 2022 · Updated 3 years ago
- TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation ☆12 · Jul 14, 2022 · Updated 3 years ago
- Graph representations of text ☆13 · Sep 20, 2023 · Updated 2 years ago
- ☆11 · Nov 12, 2024 · Updated last year
- PureScript + D3 examples ☆13 · Oct 11, 2016 · Updated 9 years ago
- Lucene open-domain QA retrieval in python ☆11 · Feb 18, 2021 · Updated 5 years ago
- distill large scale web page text ☆12 · Jul 29, 2023 · Updated 2 years ago
- ☆13 · Sep 2, 2021 · Updated 4 years ago
- VertMetric: An abstractive summarization evaluation package. VERT stands for Versatile Evaluation of Reduced Texts. ☆11 · Dec 20, 2018 · Updated 7 years ago
- ☆11 · Oct 26, 2021 · Updated 4 years ago
- ☆10 · May 20, 2019 · Updated 6 years ago
- MHW Animation Importer and Exporter Plugin for Blender 2.79 ☆14 · Aug 31, 2024 · Updated last year
- python module for connectivity analysis ☆10 · Nov 2, 2024 · Updated last year
- ☆23 · Feb 23, 2026 · Updated 2 weeks ago
- A morphosyntactic tagger for Polish based on conditional random fields ☆22 · Apr 6, 2021 · Updated 4 years ago
- Natural Perturbation for Robust Question Answering ☆12 · Apr 7, 2020 · Updated 5 years ago
- An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024) ☆10 · May 31, 2024 · Updated last year
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning ☆12 · Aug 23, 2025 · Updated 6 months ago
- Forecr Linux Kernel for Jetson Xavier, Xavier NX, Orin, Orin NX and Orin Nano based products ☆12 · Updated this week
- daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently ☆33 · Feb 4, 2026 · Updated last month
- LGEB: Benchmark of Language Generation Evaluation ☆16 · Oct 21, 2022 · Updated 3 years ago
- A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts … ☆13 · Jul 13, 2022 · Updated 3 years ago
- ☆17 · Feb 9, 2026 · Updated last month
- ICCV 2023 - AdaptGuard: Defending Against Universal Attacks for Model Adaptation ☆11 · Dec 23, 2023 · Updated 2 years ago
- A collection of plugins for j4status ☆14 · Oct 2, 2023 · Updated 2 years ago
- [obsolete] Python interface to Morfeusz ☆10 · Jul 3, 2017 · Updated 8 years ago
- CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings ☆66 · Feb 3, 2025 · Updated last year