zli12321 / qa_metrics
An easy-to-use Python package for quick, basic QA evaluation. It includes standardized QA evaluation metrics and semantic evaluation metrics: black-box and open-source large language model prompting and evaluation, exact match, F1 score, PEDANT semantic match, and transformer match. The package also supports prompting the OpenAI and Anthropic APIs.
☆60 Jul 18, 2025 Updated 6 months ago
Alternatives and similar repositories for qa_metrics
Users who are interested in qa_metrics are comparing it to the libraries listed below.
- Reinforcement Learning of Vision Language Models with Self Visual Perception Reward ☆160 Sep 23, 2025 Updated 4 months ago
- Butler is a tool project for automated service management and task scheduling. ☆15 Updated this week
- 🌏 UI component library for the future, based on WebComponent. ☆23 Nov 12, 2024 Updated last year
- Making a bridge between NLP models and Brain data ☆18 Jun 3, 2020 Updated 5 years ago
- TCM Lingdan LLM ☆45 Nov 3, 2024 Updated last year
- [ICLR 2025] A Closer Look at Machine Unlearning for Large Language Models ☆44 Dec 4, 2024 Updated last year
- ☆44 Mar 3, 2023 Updated 2 years ago
- A toolkit for building dense retrievers with deep language models. ☆64 Sep 24, 2021 Updated 4 years ago
- TyDiP Multilingual Politeness dataset and code ☆12 Oct 15, 2023 Updated 2 years ago
- Dataset of ML and NLP papers ☆34 Aug 17, 2022 Updated 3 years ago
- Implementation of various handwritten text line segmentation ☆10 Jan 6, 2020 Updated 6 years ago
- Open-source Human Feedback Library ☆11 Oct 25, 2023 Updated 2 years ago
- Residual Quantization Autoencoder, used for interpreting LLMs ☆14 Jan 1, 2025 Updated last year
- Official Repository of RefChartQA: Grounding Visual Answer on Chart Images through Instruction Tuning ☆14 Jul 9, 2025 Updated 7 months ago
- ChiMed-GPT is a Chinese medical large language model (LLM) built by continually training Ziya-v2 on Chinese medical data, where pre-train… ☆104 Dec 29, 2023 Updated 2 years ago
- ☆12 Oct 21, 2023 Updated 2 years ago
- ☆12 Dec 15, 2022 Updated 3 years ago
- ☆13 May 7, 2023 Updated 2 years ago
- [ICPR-2024] S-MultiMAE - A Multi-Ground Truth approach for RGB-D Saliency Detection ☆12 Dec 13, 2024 Updated last year
- Collection of Common Machine Translation Tools ☆11 Jul 26, 2022 Updated 3 years ago
- Compression primitives for uplink compression in Federated Learning that are compatible with Secure Aggregation. ☆10 Jul 27, 2022 Updated 3 years ago
- MATLAB implementation of the paper "Distributed Optimization of Average Consensus Containment with Multiple Stationary Leaders" [arXiv 20… ☆15 Aug 9, 2022 Updated 3 years ago
- NEAL (Nature+Energy Audio Labeller) is an open-source interactive audio data annotation tool. ☆16 Apr 7, 2025 Updated 10 months ago
- A curated list of resources on Document Layout Analysis ☆11 Aug 7, 2025 Updated 6 months ago
- PyTorch implementation of FAIR's paper "End-to-End Memory Network", NIPS 2015 ☆12 Oct 19, 2017 Updated 8 years ago
- ☆11 Mar 13, 2023 Updated 2 years ago
- MathNet: A Data-Centric Approach, Dataset and Benchmark Model to Advance Mathematical Expression Recognition ☆10 Mar 19, 2025 Updated 10 months ago
- STRExp is a framework that provides Explainability (XAI) to Scene Text Recognition (STR) models. ☆11 Nov 27, 2023 Updated 2 years ago
- HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models ☆13 Mar 6, 2025 Updated 11 months ago
- Implementation of the spotlight: a method for discovering systematic errors in deep learning models ☆11 Oct 5, 2021 Updated 4 years ago
- Cluster paraphrases by word sense ☆12 Jan 3, 2019 Updated 7 years ago
- Implementation of the DocLLM paper for Llama models. ☆13 Apr 6, 2025 Updated 10 months ago
- Official implementation for AAAI 2025 paper: SSAN: A Symbol Spatial-Aware Network for Handwritten Mathematical Expression Recognition ☆15 Jan 21, 2025 Updated last year
- Official code for the paper: "Perception and Semantic Aware Regularization for Sequential Confidence Calibration (CVPR2023)" ☆10 May 15, 2024 Updated last year
- [WACV2025] source code of StrDA: https://arxiv.org/abs/2410.09913 ☆12 Apr 15, 2025 Updated 9 months ago
- John Langford's original release of Vowpal Wabbit -- a fast online learning algorithm ☆16 Jul 25, 2017 Updated 8 years ago
- Dead simple Linear Kalman Filter. Contains 2-D based tracker ☆12 Dec 13, 2025 Updated 2 months ago
- a decentralized dataset generator and manipulator. ☆13 Updated this week
- DEPRECATED: Tool for checking data leaks of social media platforms ☆10 Feb 20, 2022 Updated 3 years ago