☆26Oct 22, 2025Updated 4 months ago
Alternatives and similar repositories for AgentAuditor-ASSEBench
Users that are interested in AgentAuditor-ASSEBench are comparing it to the libraries listed below
- Towards a Mechanistic Understanding of Large Reasoning Models: A Survey of Training, Inference, and Failures☆30Jan 29, 2026Updated last month
- [ACL 2025] Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms☆36Jun 4, 2025Updated 9 months ago
- The official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness.☆22Jun 9, 2022Updated 3 years ago
- To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models☆33May 21, 2025Updated 9 months ago
- Code for ACL 2025 Main paper "Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning…☆46Aug 4, 2025Updated 7 months ago
- ☆22Dec 11, 2025Updated 2 months ago
- SpectraGuru - A Spectra Analysis Application☆29Feb 25, 2026Updated last week
- ☆11Oct 31, 2024Updated last year
- ☆18Feb 16, 2025Updated last year
- Official implementation of MAS-GPT: Training LLMs to Build LLM-based Multi-Agent Systems☆75Jun 26, 2025Updated 8 months ago
- ☆11Apr 12, 2024Updated last year
- Public repo for ETH Escape CTF @ Devcon 2024: https://devcon.org/☆13Dec 11, 2024Updated last year
- MLOps community survey☆10Dec 19, 2022Updated 3 years ago
- Sentiment analysis with Vietnamese reviews from Shopee online market☆12Nov 21, 2022Updated 3 years ago
- This repository contains the implementation of a Deep Deterministic Policy Gradient (DDPG) algorithm applied to solve the Reacher environ…☆12Apr 8, 2023Updated 2 years ago
- Codes for GReTo accepted by ICLR2023☆12Mar 12, 2023Updated 2 years ago
- Long Context Research☆26Jan 26, 2026Updated last month
- A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning.☆17Aug 23, 2025Updated 6 months ago
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"☆11Oct 11, 2024Updated last year
- ☆10Nov 13, 2022Updated 3 years ago
- Official implementation of the paper "M3CoTBench: Benchmark Chain-of-Thought of MLLMs in Medical Image Understanding"☆21Jan 14, 2026Updated last month
- A comprehensive guide and codebase for building AI agents with LLMs, featuring practical implementations from basic LLM interactions to c…☆13Jan 30, 2025Updated last year
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State☆11Sep 21, 2024Updated last year
- ☆14Oct 19, 2025Updated 4 months ago
- ☆14Nov 5, 2025Updated 3 months ago
- ☆11Sep 8, 2023Updated 2 years ago
- ☆10Sep 7, 2022Updated 3 years ago
- All-in-One Safety Evaluation Framework☆41Feb 13, 2026Updated 2 weeks ago
- ☆21Jun 16, 2025Updated 8 months ago
- T5Patches is a set of tools for fast and targeted editing of generative language models built with T5X.☆12May 31, 2024Updated last year
- ☆14Jul 4, 2022Updated 3 years ago
- ☆10Jun 25, 2021Updated 4 years ago
- A local search system implementation using Elasticsearch for Wikipedia data indexing and retrieval.☆12May 17, 2025Updated 9 months ago
- ☆11Sep 28, 2022Updated 3 years ago
- Self-hosted AI chat featuring ChatGPT, DALL·E, Firebase auth, MySQL, multi-chat rooms, conversation memory, and a responsive light/dark m…☆12May 12, 2023Updated 2 years ago
- Serial library for Micropython Unix port☆12Sep 10, 2020Updated 5 years ago
- ☆13Feb 14, 2024Updated 2 years ago
- PALI: Language identification for Perso-Arabic Scripts☆11Jul 11, 2023Updated 2 years ago
- Official code for FAccT'21 paper "Fairness Through Robustness: Investigating Robustness Disparity in Deep Learning" https://arxiv.org/abs…☆13Mar 9, 2021Updated 4 years ago