☆22Jan 14, 2025Updated last year
Alternatives and similar repositories for SafeInfer
Users that are interested in SafeInfer are comparing it to the libraries listed below
- 1st Place Team Crane: @aswinkumar1999 @rathull @kyolebu☆29Sep 8, 2025Updated 5 months ago
- [CVPR'25] Official code of paper "Mimic In-Context Learning for Multimodal Tasks"☆24Jun 8, 2025Updated 8 months ago
- ☆11Nov 12, 2024Updated last year
- SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks (CVPR'25)☆19Jul 1, 2025Updated 8 months ago
- The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed …☆11Sep 27, 2024Updated last year
- Classifier for social media images with 3 categories namely neutral, nsfw and violence.☆18Apr 4, 2023Updated 2 years ago
- ☆13Feb 24, 2025Updated last year
- Instruction Following Eval☆15Jan 16, 2025Updated last year
- Cog wrapper for playgroundai/playground-v2.5-1024px-aesthetic☆17Nov 25, 2024Updated last year
- [ACL 2025 Main] Open-source toolkit for automatic evaluation of text-to-image generation task, including training & test datasets and a d…☆16Jul 5, 2025Updated 8 months ago
- "Visual Prompt Selection for In-Context Learning Segmentation Framework"☆15Dec 13, 2024Updated last year
- ☆16Jun 19, 2023Updated 2 years ago
- SRS is an industrial-strength live cluster, with simple code and best conceptual integrity.☆11Nov 14, 2021Updated 4 years ago
- [AAAI 2023 Oral] Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training☆14Apr 19, 2023Updated 2 years ago
- I don't want to maintain this project, the code probably won't compile or run. Archived.☆13Feb 25, 2024Updated 2 years ago
- ☆18Nov 30, 2025Updated 3 months ago
- In-Situ Evaluator: Real-Time Subsample Analysis☆15Jan 25, 2026Updated last month
- Joint learning of object and action detectors☆15Nov 5, 2019Updated 6 years ago
- ☆15Nov 17, 2020Updated 5 years ago
- Feedback Driven Alpha Generation Pipeline combining Gemini API with the WorldQuant BRAIN API to automatically generate, backtest and iter…☆37Jan 10, 2026Updated last month
- A model evaluation system based on OpenCompass that provides a front-end UI so users can run evaluations on their own.☆25Aug 25, 2025Updated 6 months ago
- SG-Bench: Evaluating LLM Safety Generalization Across Diverse Tasks and Prompt Types☆25Nov 29, 2024Updated last year
- PULSE-EVAL☆24Jan 12, 2024Updated 2 years ago
- The official repository of the paper "The Digital Cybersecurity Expert: How Far Have We Come?" presented in IEEE S&P 2025☆24May 21, 2025Updated 9 months ago
- LLM evaluation.☆16Nov 7, 2023Updated 2 years ago
- ☆21Aug 19, 2024Updated last year
- The official repo for the EMNLP 2024 (main) paper "EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimo…☆20Apr 9, 2025Updated 10 months ago
- Task Complexity Classifier using Transformer-based NLP model based on Bloom's Taxonomy☆34Aug 18, 2025Updated 6 months ago
- ☆25Jun 16, 2024Updated last year
- Reproducible Language Agent Research☆34Jun 25, 2025Updated 8 months ago
- ☆27Apr 18, 2025Updated 10 months ago
- The rule-based evaluation subset and code implementation of Omni-MATH☆26Dec 23, 2024Updated last year
- VulnHeist is an Automated Penetration Testing Suite 🔖 that streamlines vulnerability scanning 🔍 and exploitation 💥 using Nmap 🌐 and …☆36Mar 22, 2025Updated 11 months ago
- ☆26Jun 5, 2024Updated last year
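- This work proposes a safety evaluation benchmark for Chinese LLMs based on "文心一言" (ERNIE Bot), covering 8 typical safety scenarios and 6 types of instruction attacks. It also proposes a safety evaluation framework and process that uses test prompts written by hand and collected from open-source data, and that combines human intervention with the strong evaluation ability of LLMs acting as "co-evaluators".☆33Sep 1, 2023Updated 2 years ago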
- DataSciBench: An LLM Agent Benchmark for Data Science☆52Jan 21, 2026Updated last month
- Evaluator for LLMs☆27Jan 25, 2024Updated 2 years ago
- A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks☆14Feb 25, 2025Updated last year
- [ACL 2024 Main Conference] Chinese commonsense benchmark for LLMs☆44Jul 27, 2024Updated last year