xingyuanbu/opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

☆19

Alternatives and similar repositories for opencompass

Users that are interested in opencompass are comparing it to the libraries listed below.

mtbench101 / mt-bench-101
[ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
☆152 · Jul 24, 2024 · Updated 2 years ago
tbh-98 / Hypergraph-MLP
☆20 · Jan 9, 2024 · Updated 2 years ago
mfandre / GanttEcharts
Gantt chart using ECharts
☆13 · Mar 31, 2021 · Updated 5 years ago
ErikZ719 / CoTA
[ICLR 26] Context Tokens are Anchors: Understanding the Repeat Curse in dMLLMs from an Information Flow Perspective
☆16 · Mar 6, 2026 · Updated 4 months ago
AndreasMadsen / nlp-roar-interpretability
Measuring if attention is explanation with ROAR
☆22 · Mar 3, 2023 · Updated 3 years ago
wanghangpsu / MM-BD
The implementation of the IEEE S&P 2024 paper MM-BD: Post-Training Detection of Backdoor Attacks with Arbitrary Backdoor Pattern Types Us…
☆16 · May 12, 2024 · Updated 2 years ago
AI-secure / adversarial-glue
[NeurIPS 2021] "Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models" by Boxin Wang*, Chejian Xu*, Shuoh…
☆13 · Apr 3, 2023 · Updated 3 years ago
wwh0411 / MCP-Flow
[ACL 2026 Main] MCP-Flow: Facilitating LLM Agents to Master Real-World, Diverse and Scaling MCP Tools.
☆25 · Apr 8, 2026 · Updated 3 months ago
uqwhua / TrjPrivacy
TKDE'23: A Survey and Experimental Study on Privacy-Preserving Trajectory Data Publishing
☆12 · May 5, 2023 · Updated 3 years ago
UKPLab / acl2022-impli
☆13 · Mar 15, 2022 · Updated 4 years ago
icip-cas / MemSearcher
MemSearcher is a search agent that keeps a compact, iteratively-updated memory instead of the full interaction history, trained end-to-en…
☆27 · Jun 29, 2026 · Updated 3 weeks ago
eholk / bench-dot-product
Several variations of a dot product benchmark.
☆11 · Dec 10, 2012 · Updated 13 years ago
keyurfaldu / AIgrads
This is a niche collection of research papers which are proven to be gradients pushing the field of Natural Language Processing, Deep Lea…
☆25 · Nov 19, 2024 · Updated last year
JCruan519 / GIST
(ACM MM24) This is the official repository of GIST: Improving Parameter Efficient Fine Tuning via Knowledge Interaction.
☆11 · Jan 28, 2024 · Updated 2 years ago
CGCL-codes / TeCo
[CVPR 2023] The official implementation of our CVPR 2023 paper "Detecting Backdoors During the Inference Stage Based on Corruption Robust…
☆25 · May 25, 2023 · Updated 3 years ago
imcjp / DPTrajExperiments
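This project implements the method from the paper "A Consistency-Satisfying Trajectory Traffic Publishing Method under Differential Privacy" (《差分隐私下满足一致性的轨迹流量发布方法》) by Cai Jianping.
☆12 · Sep 2, 2019 · Updated 6 years ago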
bojone / LST-CLUE
A simple attempt at Ladder Side-Tuning on CLUE
☆23 · Jun 20, 2022 · Updated 4 years ago
JunfengGo / SCALE-UP
☆29 · Jun 17, 2024 · Updated 2 years ago
BinWang28 / FacEval
EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization
☆13 · Mar 20, 2025 · Updated last year
isle-dev / MetricEval
MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va…
☆12 · Nov 6, 2023 · Updated 2 years ago
zhongwanjun / ProQA
The code for the paper "ProQA: Structural Prompt-based Pre-training for Unified Question Answering"
☆11 · Feb 7, 2023 · Updated 3 years ago
zhiao777774 / awesome-personalized-lm
A curated list of personalized language model / large language model resources (continually updated)
☆10 · Nov 17, 2023 · Updated 2 years ago
Aman-4-Real / CrEval
[ICLR 2026] Evaluating Text Creativity across Diverse Domains: A Dataset and a Large Language Model Evaluator
☆18 · Feb 28, 2026 · Updated 4 months ago
lv2020 / EBM
LBSN based on the Foursquare dataset
☆14 · Apr 26, 2019 · Updated 7 years ago
LanD-FBK / benchmark-gen-explanations
Code for "Benchmarking the Generation of Fact Checking Explanations"
☆10 · Aug 16, 2024 · Updated last year
yuhuifishash / SysY
First-prize entry in the RISC-V track of the 2024 Compiler System Implementation Contest (a compiler for SysY, a subset of C)
☆24 · Sep 4, 2024 · Updated last year
prokolyvakis / deep-align
This repository contains our implementation of the ontology matching framework based on representation learning.
☆15 · May 7, 2018 · Updated 8 years ago
JiayuJeff / PlanBench-XL
Official Repository for our paper: PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems
☆38 · Jul 16, 2026 · Updated last week
JCruan519 / iDAT
(ICME24) This is the official repository of iDAT: inverse Distillation Adapter-Tuning.
☆13 · Apr 3, 2024 · Updated 2 years ago
vicgalle / refined-dpo
Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs
☆13 · Feb 13, 2024 · Updated 2 years ago
tsinghua-fib-lab / PateGail
☆17 · Aug 6, 2023 · Updated 2 years ago
BKHMSI / cultural-trends
Investigating Cultural Alignment of Large Language Models
☆13 · Aug 14, 2024 · Updated last year
xbmxb / StructureCharacterization4DD
https://openreview.net/forum?id=OC1o4_OI6Jw
☆13 · May 27, 2022 · Updated 4 years ago
NLP2CT / ua-cl-nmt
Uncertainty-Aware Curriculum Learning for Neural Machine Translation (ACL 2020)
☆11 · Jun 12, 2020 · Updated 6 years ago
lemon0830 / promptCSE
Code for promptCSE (EMNLP 2022)
☆11 · Apr 10, 2023 · Updated 3 years ago
xiaolin-cs / BackTime
BackTime: Backdoor Attacks on Multivariate Time Series Forecasting
☆32 · Apr 14, 2025 · Updated last year
amazon-science / faithful-summarization-generation
☆16 · Mar 27, 2023 · Updated 3 years ago
chenzl23 / FZUThesis
LaTeX template for Fuzhou University doctoral dissertations
☆18 · May 27, 2024 · Updated 2 years ago
cambridgeltl / zepo
Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments (Zhou et al., EMNLP 2024)
☆14 · Oct 3, 2024 · Updated last year