☆60Apr 2, 2025Updated 11 months ago
Alternatives and similar repositories for liveswebench
Users that are interested in liveswebench are comparing it to the libraries listed below.
- [ACL25] FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation☆48Jan 28, 2026Updated 2 months ago
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts"☆19May 29, 2023Updated 2 years ago
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"☆41Sep 24, 2024Updated last year
- ☆81Dec 5, 2024Updated last year
- LiveBench: A Challenging, Contamination-Free LLM Benchmark☆1,108Mar 23, 2026Updated last week
- ☆16Mar 30, 2024Updated 2 years ago
- Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding”☆31May 1, 2023Updated 2 years ago
- Terraform modules and Ansible playbook for Apache SkyWalking☆12Mar 11, 2024Updated 2 years ago
- Kernel Playground - A playground to run large scale experiments on the Linux Kernel☆18Nov 8, 2025Updated 4 months ago
- musl: A C standard library☆16Feb 26, 2026Updated last month
- Experiment with evolution strats☆16Jun 16, 2017Updated 8 years ago
- ☆13May 12, 2025Updated 10 months ago
- Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations @ ICCV21☆13Jul 15, 2022Updated 3 years ago
- ☆17Feb 9, 2026Updated last month
- The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed …☆11Sep 27, 2024Updated last year
- ☆25Feb 23, 2026Updated last month
- GitHub actions to build wheels for nogil Python☆14Apr 8, 2024Updated last year
- CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings☆68Feb 3, 2025Updated last year
- Instruction Following Eval☆16Jan 16, 2025Updated last year
- An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024)☆10May 31, 2024Updated last year
- Open-Retrieval Conversational Machine Reading: A new setting & OR-ShARC dataset☆13Nov 19, 2022Updated 3 years ago
- Code for the ICCV 2023 paper "Benchmarking Low-Shot Robustness to Natural Distribution Shifts"☆11Jan 21, 2024Updated 2 years ago
- Benchmarking LLM Inference Speeds☆13Mar 3, 2026Updated 3 weeks ago
- ☆12Feb 22, 2021Updated 5 years ago
- Cog wrapper for playgroundai/playground-v2.5-1024px-aesthetic☆17Nov 25, 2024Updated last year
- ☆11Nov 12, 2024Updated last year
- Chainer and PyTorch implementation of GAN with gradient reversal layer☆10Mar 19, 2022Updated 4 years ago
- Cloud Native Community Observability SIG.☆11Mar 22, 2021Updated 5 years ago
- This is for EMNLP 2024 Paper: AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction☆15Nov 4, 2024Updated last year
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]☆651Jul 29, 2025Updated 8 months ago
- TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation☆12Jul 14, 2022Updated 3 years ago
- ☆17Nov 26, 2023Updated 2 years ago
- ☆13Sep 2, 2021Updated 4 years ago
- springboot demo☆12Jul 22, 2023Updated 2 years ago
- Pytorch implementation of Tree Preference Optimization (TPO) (Accepted by ICLR'25)☆26Apr 24, 2025Updated 11 months ago
- [EMNLP 2025] Verification Engineering for RL in Instruction Following☆53Jan 5, 2026Updated 2 months ago
- Forecr Linux Kernel for Jetson Xavier, Xavier NX, Orin, Orin NX and Orin Nano based products☆12Mar 18, 2026Updated last week
- [ACL 2023] Code and data for our paper "Measuring Progress in Fine-grained Vision-and-Language Understanding"☆13Jun 11, 2023Updated 2 years ago
- PaStiX (Parallel Sparse matriX package) solver library☆20Nov 20, 2018Updated 7 years ago