☆38Feb 16, 2024Updated 2 years ago
Alternatives and similar repositories for LLM-evaluation-datasets
Users who are interested in LLM-evaluation-datasets are comparing it to the libraries listed below.
- An Interactive Hex-Rays Microcode Explorer☆17Feb 8, 2024Updated 2 years ago
- ☆12Jul 8, 2022Updated 3 years ago
- Red Team AI Benchmark: Evaluating Uncensored LLMs for Offensive Security☆42Dec 25, 2025Updated 4 months ago
- [NAACL 2024 Findings] Deja vu: Contrastive Historical Modeling with Prefix-tuning for Temporal Knowledge Graph Reasoning☆15Jul 8, 2024Updated last year
- The official repository of NeurIPS'25 paper "Ada-R1: From Long-Cot to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization"☆23May 6, 2026Updated 2 weeks ago
- Binary Ninja deobfuscation plugin☆22Jul 23, 2025Updated 9 months ago
- The official repo for "OpenMoE 2: Sparse Diffusion Language Models".☆56Dec 28, 2025Updated 4 months ago
- LogicBench is a natural language question-answering dataset consisting of 25 different reasoning patterns spanning over propositional, fi…☆38May 2, 2024Updated 2 years ago
- A collection of Frida scripts☆35Feb 6, 2026Updated 3 months ago
- ☆32Sep 13, 2024Updated last year
- 🥇 Amazon Nova AI Challenge Winner - ASTRA emerged victorious as the top attacking team in Amazon's global AI safety competition, defeati…☆71May 11, 2026Updated last week
- Code for the benchmarking single-cell foundation models (scGPT, scBERT, and Geneformer) for cell-type annotation task using skewed single…☆15Dec 8, 2024Updated last year
- Formally proving the security of Fast Reed-Solomon interactive oracle proofs of proximity☆90Dec 11, 2025Updated 5 months ago
- a secret detection tool☆40Mar 1, 2026Updated 2 months ago
- Extension for CoEdPilot☆21Feb 25, 2026Updated 2 months ago
- iOS application class-dump using Frida☆40Apr 28, 2023Updated 3 years ago
- Undergraduate innovation project: hierarchical attention machine translation☆17Apr 12, 2021Updated 5 years ago
- Phishing detection using GNNs (SECRYPT'22)☆15Jun 6, 2025Updated 11 months ago
- A polyglot static analysis engine for detecting vulnerabilities in scripting languages native extensions based on joern.☆21Sep 1, 2025Updated 8 months ago
- ☆30Dec 23, 2025Updated 4 months ago
- ☆30Aug 21, 2025Updated 9 months ago
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆91Nov 13, 2024Updated last year
- Indirect Prompt Injection Methodology (IPIM) - A structured process which security professionals can use to find Indirect Prompt Injectio…☆21Jul 28, 2025Updated 9 months ago
- A data construction and evaluation framework to quantify privacy norm awareness of language models (LMs) and emerging privacy risk of LM …☆45Mar 4, 2025Updated last year
- [COLM'24] How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?☆22Oct 13, 2024Updated last year
- Python implementation of the wavelet analysis found in Torrence and Compo (1998)☆14Nov 30, 2024Updated last year
- LLM4OR homepage project.☆26Aug 29, 2025Updated 8 months ago
- ☆11Mar 19, 2024Updated 2 years ago
- [COLING Demos 2025] an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs☆38Mar 4, 2025Updated last year
- The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed …☆11Sep 27, 2024Updated last year
- lineno – Line numbers on paragraphs☆16Apr 8, 2026Updated last month
- A Myers–Briggs Type Indicator testing app built with React/Node/Express/MySQL☆16Jan 4, 2023Updated 3 years ago
- [EMNLP 2024] A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners☆27Dec 11, 2024Updated last year
- ReLoop: RetailOpt-190 Benchmark and Codebase☆110Apr 29, 2026Updated 3 weeks ago
- Best used alongside the accompanying blog (https://blog.csdn.net/hffhjh111/category_6634919.html).☆13Jan 17, 2019Updated 7 years ago
- Implementation of our SIGIR 2017 paper : "Multitask Learning for Fine-Grained Twitter Sentiment Analysis"☆10May 1, 2018Updated 8 years ago
- SCCD: a session-based Chinese cyberbullying detection dataset☆22Mar 9, 2025Updated last year
- ☆14Aug 23, 2022Updated 3 years ago
- Instruction Following Eval☆17Jan 16, 2025Updated last year