GitsSaikat / Guardian-Agent
Improving AI Systems with Self-Defense Mechanisms
☆19 Updated 5 months ago
Alternatives and similar repositories for Guardian-Agent
Users who are interested in Guardian-Agent are comparing it to the repositories listed below.
- ☆25 Updated 2 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆73 Updated 5 months ago
- The official implementation for the paper "Agentic-R1: Distilled Dual-Strategy Reasoning" ☆94 Updated last month
- AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents ☆29 Updated 2 months ago
- ☆19 Updated 5 months ago
- This repository contains popular code generation frameworks such as MapCoder and CodeSIM. ☆56 Updated 2 months ago
- ☆55 Updated last month
- Resa: Transparent Reasoning Models via SAEs ☆41 Updated 2 weeks ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆33 Updated 4 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆101 Updated 2 months ago
- Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models" ☆41 Updated 4 months ago
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 Updated this week
- A Qwen 0.5B reasoning model trained on OpenR1-Math-220k ☆14 Updated 6 months ago
- ☆21 Updated last month
- ☆34 Updated 3 weeks ago
- ☆20 Updated last month
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆92 Updated 3 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆101 Updated 4 months ago
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning ☆90 Updated last month
- Accompanying material for the sleep-time compute paper ☆105 Updated 3 months ago
- Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok … ☆24 Updated last month
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆118 Updated 6 months ago
- Esoteric Language Models ☆94 Updated 3 weeks ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆55 Updated 6 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆33 Updated 3 weeks ago
- ☆213 Updated 6 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆88 Updated this week
- SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning ☆61 Updated last month
- Train transformer language models with reinforcement learning. ☆19 Updated 6 months ago
- ☆15 Updated last month