IAAR-Shanghai/NewsBench

[ACL 2024 Main] NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism

☆34

Alternatives and similar repositories for NewsBench

Users that are interested in NewsBench are comparing it to the libraries listed below.

IAAR-Shanghai / Grimoire
Grimoire is All You Need for Enhancing Large Language Models
☆120 · Feb 29, 2024 · Updated 2 years ago
IAAR-Shanghai / UHGEval
[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.
☆180 · Jun 7, 2025 · Updated 11 months ago
IAAR-Shanghai / DATG
[ACL 2024] Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs
☆40 · Sep 24, 2024 · Updated last year
IAAR-Shanghai / CRUD_RAG
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models
☆382 · May 20, 2025 · Updated last year
IAAR-Shanghai / xFinder
[ICLR 2025] xFinder: Large Language Models as Automated Evaluators for Reliable Evaluation
☆177 · Nov 14, 2025 · Updated 6 months ago
IAAR-Shanghai / Awesome-Attention-Heads
An awesome repository & A comprehensive survey on interpretability of LLM attention heads.
☆405 · Mar 2, 2025 · Updated last year
MemTensor / HaluMem
HaluMem is the first operation level hallucination evaluation benchmark tailored to agent memory systems.
☆138 · Apr 30, 2026 · Updated 3 weeks ago
JasonForJoy / FIRE
EMNLP 2020: Filtering before Iteratively Referring for Knowledge-Grounded Response Selection in Retrieval-Based Chatbots
☆12 · Dec 15, 2020 · Updated 5 years ago
hallogameboy / QDS-Transformer
☆16 · Sep 28, 2020 · Updated 5 years ago
open-compass / GPassK
[ACL 2025] Are Your LLMs Capable of Stable Reasoning?
☆33 · Aug 5, 2025 · Updated 9 months ago
thunlp / LEAD
Enhancing Legal Case Retrieval via Scaling High-quality Synthetic Query-Candidate Pairs (EMNLP 2024)
☆16 · Nov 17, 2024 · Updated last year
NUST-Machine-Intelligence-Laboratory / SED
A self-adaptive and class-balanced approach to improve deep neural network performance in the presence of noisy labels
☆18 · Jul 2, 2024 · Updated last year
thaolmk54 / LOGNet-VQA
Implementation for the paper "Dynamic Language Binding in Relational Visual Reasoning" (Le et al., IJCAI 2020)
☆13 · Jul 25, 2024 · Updated last year
ScalingIntelligence / CATS
☆33 · Nov 11, 2024 · Updated last year
InfiniTensor / learning-cxx
☆29 · Jan 25, 2025 · Updated last year
AI4CTS / E2Usd
Artifact evaluation for "E2Usd: Efficient-yet-effective Unsupervised State Detection for Multivariate Time Series" accepted by WWW'24
☆13 · Jul 29, 2024 · Updated last year
MemTensor / MemOS-Docs
Documentation for the repository: https://github.com/MemTensor/MemOS
☆26 · Updated this week
wbfwonderful / Fed-WSVAD
Official code for "Federated Weakly Supervised Video Anomaly Detection with Multimodal Prompt" (AAAI2025)
☆27 · May 27, 2025 · Updated last year
Jingtong0527 / RobuRCDet
☆18 · Sep 10, 2025 · Updated 8 months ago
aurooj / WSG-VQA-VLTransformers
Weakly Supervised Grounding for VQA in Vision-Language Transformers
☆16 · May 6, 2023 · Updated 3 years ago
kjason / SubspaceRepresentationLearning
Subspace Representation Learning for Sparse Linear Arrays to Localize More Sources than Sensors: A Deep Learning Methodology
☆20 · Mar 18, 2025 · Updated last year
OpenMOSS / HalluQA
Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"
☆139 · Jun 5, 2024 · Updated last year
rainorangelemon / GNN-Multi-Agent-Search
The official repo for ICRA 2023 paper 'Accelerating Multi-Agent Planning Using Graph Transformers with Bounded Suboptimality'
☆20 · May 25, 2023 · Updated 3 years ago
yzhan238 / TELEClass
The source code used for paper "TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision…
☆24 · Apr 6, 2025 · Updated last year
NJUNLP / Hallu-PI
The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed …
☆11 · Sep 27, 2024 · Updated last year
josejg / instruction_following_eval
Instruction Following Eval
☆17 · Jan 16, 2025 · Updated last year
Mia-YatingYu / STDD
[AAAI'25]: Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP
☆21 · Aug 5, 2025 · Updated 9 months ago
cipher982 / llm-benchmarks
Benchmarking LLM Inference Speeds
☆13 · May 17, 2026 · Updated last week
maziao / T2I-Eval
[ACL 2025 Main] Open-source toolkit for automatic evaluation of text-to-image generation task, including training & test datasets and a d…
☆20 · Jul 5, 2025 · Updated 10 months ago
lucataco / cog-playground-v2.5-1024px-aesthetic
Cog wrapper for playgroundai/playground-v2.5-1024px-aesthetic
☆17 · Nov 25, 2024 · Updated last year
ErxinYu / CoSafe-Dataset
☆11 · Nov 12, 2024 · Updated last year
shh1574 / multi-modal-dialogue-dataset
☆22 · Aug 30, 2021 · Updated 4 years ago
qwerty258 / libgbt28181
I don't want to maintain this project, the code probably won't compile or run. Archived.
☆13 · Feb 25, 2024 · Updated 2 years ago
AiRyunn / BoT
Implementation of "Bag of Tricks for Node Classification with Graph Neural Networks" based on DGL
☆35 · Jan 26, 2025 · Updated last year
CLUEbenchmark / SuperCLUE-Video
A native-Chinese, multi-level evaluation benchmark for text-to-video generation
☆18 · Jul 8, 2024 · Updated last year
wangcunxiang / QA-Eval
The repository for paper <Evaluating Open-QA Evaluation>
☆25 · Apr 9, 2024 · Updated 2 years ago
Eathoublu / attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
☆19 · Aug 29, 2018 · Updated 7 years ago
MLNLP-World / Overleaf-Bib-Helper
Enhances Overleaf by allowing article searches and BibTeX retrieval from DBLP and Google Scholar
☆127 · Feb 3, 2026 · Updated 3 months ago
SongW-SW / TENT
☆19 · Nov 8, 2022 · Updated 3 years ago