☆38Feb 16, 2024Updated 2 years ago
Alternatives and similar repositories for LLM-evaluation-datasets
Users that are interested in LLM-evaluation-datasets are comparing it to the libraries listed below.
- An Interactive Hex-Rays Microcode Explorer☆17Feb 8, 2024Updated 2 years ago
- Writeup and exploit for CVE-2025-22441: Privilege escalation from installed app to SystemUI process on Android due to pass of untrusted A…☆101Oct 8, 2025Updated 8 months ago
- Cross-Site Scripting (XSS) is a common vulnerability that allows attackers to inject malicious scripts into web pages viewed by users. In…☆11Sep 10, 2024Updated last year
- Coverage gathering JVMTI agent for Android☆28Oct 11, 2023Updated 2 years ago
- ☆14May 7, 2024Updated 2 years ago
- ☆12Jul 8, 2022Updated 3 years ago
- Benchmarks for the VNN Comp 2023☆16Jun 7, 2024Updated 2 years ago
- [NAACL 2024 Findings] Deja vu: Contrastive Historical Modeling with Prefix-tuning for Temporal Knowledge Graph Reasoning☆14Jul 8, 2024Updated last year
- Red Team AI Benchmark: Evaluating Uncensored LLMs for Offensive Security☆50Jun 22, 2026Updated last week
- Binary Ninja deobfuscation plugin☆22Jul 23, 2025Updated 11 months ago
- KeySentry – Find leaked API keys & secrets in any GitHub repo. No mercy.☆42May 29, 2026Updated last month
- The official repo for "OpenMoE 2: Sparse Diffusion Language Models".☆58Dec 28, 2025Updated 6 months ago
- For Certified Robustness to Text Adversarial Attacks by Randomized [MASK]☆17Oct 8, 2024Updated last year
- Keeps track of popular provable training and verification approaches towards robust neural networks, including leaderboards on popular da…☆19Jun 12, 2024Updated 2 years ago
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents☆82Apr 24, 2026Updated 2 months ago
- Collection of Frida scripts☆36Feb 6, 2026Updated 4 months ago
- ☆48Feb 10, 2025Updated last year
- ☆32Sep 13, 2024Updated last year
- 🥇 Amazon Nova AI Challenge Winner - ASTRA emerged victorious as the top attacking team in Amazon's global AI safety competition, defeati…☆73May 11, 2026Updated last month
- Code for benchmarking single-cell foundation models (scGPT, scBERT, and Geneformer) on the cell-type annotation task using skewed single…☆15Dec 8, 2024Updated last year
- Heart Murmur Detection from Phonocardiogram Recordings: The George B. Moody PhysioNet Challenge 2022☆15Jan 6, 2026Updated 5 months ago
- This is the official implementation of TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data☆13Jul 21, 2024Updated last year
- a secret detection tool☆40Mar 1, 2026Updated 3 months ago
- Extension for CoEdPilot☆21Feb 25, 2026Updated 4 months ago
- iOS application class-dump using Frida☆39Apr 28, 2023Updated 3 years ago
- College innovation project: hierarchical attention machine translation☆17Apr 12, 2021Updated 5 years ago
- A polyglot static analysis engine, based on Joern, for detecting vulnerabilities in the native extensions of scripting languages.☆22Sep 1, 2025Updated 9 months ago
- ☆15Mar 22, 2021Updated 5 years ago
- Let us control diffusion models☆13Feb 19, 2023Updated 3 years ago
- ☆30Dec 23, 2025Updated 6 months ago
- ☆12Jun 30, 2024Updated 2 years ago
- ☆30Aug 21, 2025Updated 10 months ago
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆91Nov 13, 2024Updated last year
- Indirect Prompt Injection Methodology (IPIM) - A structured process which security professionals can use to find Indirect Prompt Injectio…☆21Jul 28, 2025Updated 11 months ago
- A data construction and evaluation framework to quantify privacy norm awareness of language models (LMs) and emerging privacy risk of LM …☆46Mar 4, 2025Updated last year
- Python implementation of the wavelet analysis found in Torrence and Compo (1998)☆14Nov 30, 2024Updated last year
- ☆12Mar 19, 2024Updated 2 years ago
- AgenTracer: A Lightweight Failure Attributor for Agentic Systems☆96Nov 12, 2025Updated 7 months ago
- Original S_transform MATLAB code ported to Python☆20Mar 22, 2019Updated 7 years ago