DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails
☆31Feb 26, 2025Updated last year
Alternatives and similar repositories for DuoGuard
Users who are interested in DuoGuard are comparing it to the repositories listed below
- ☆19Aug 4, 2025Updated 6 months ago
- Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization☆39Feb 7, 2026Updated 3 weeks ago
- [ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving☆24Aug 25, 2025Updated 6 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning☆156Dec 24, 2024Updated last year
- Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts☆25Feb 23, 2024Updated 2 years ago
- ☆29May 4, 2024Updated last year
- Code base for internal reward models and PPO training☆24Oct 1, 2023Updated 2 years ago
- Software Engineering Back End Microservices Project☆15Nov 20, 2024Updated last year
- This is the official repository for paper: cross-modal information flow in multimodal large language models☆41May 21, 2025Updated 9 months ago
- Implementation of ACL-2021 paper: Cross-lingual Text Classification with Heterogeneous Graph Neural Network.☆29May 25, 2021Updated 4 years ago
- Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage)☆10Feb 2, 2024Updated 2 years ago
- ☆40May 24, 2024Updated last year
- ☆37Oct 2, 2024Updated last year
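- Code implementation for my blog post "Building a deep learning system for CIFAR-10 classification with a NumPy-based convolutional neural network in Python, without a framework".☆10Jul 1, 2019Updated 6 years ago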
- A Multi-Session and Multi-Therapy Benchmark for High-Realism AI Psychological Counselor☆29Jan 13, 2026Updated last month
- Support for training SSD on TF2☆12Mar 29, 2023Updated 2 years ago
- Use MobileNet SSD and OpenCV to detect and count cars on the road☆12Jan 13, 2020Updated 6 years ago
- ☆56May 21, 2025Updated 9 months ago
- An implementation of MSSRM method☆11Mar 23, 2023Updated 2 years ago
- The code for the paper "A Bayesian Approach to Online Planning" published in ICML 2024.☆13Jun 17, 2024Updated last year
- ☆14May 1, 2023Updated 2 years ago
- ☆16Sep 17, 2024Updated last year
- ☆11Jan 11, 2022Updated 4 years ago
- Cross-lingual Fact-to-Text Alignment and Generation for Low-Resource Languages☆11Jan 1, 2023Updated 3 years ago
- code for polite☆11Feb 28, 2024Updated 2 years ago
- Model for Udacity's challenge which uses end-to-end learning to predict steering angles from just front camera image as input for self dr…☆10Apr 21, 2017Updated 8 years ago
- ☆14Mar 21, 2024Updated last year
- The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"☆29Feb 23, 2026Updated last week
- A comprehensive ELT pipeline for analyzing passenger satisfaction data. Features a modern data architecture with Apache Airflow for extra…☆12Oct 5, 2025Updated 4 months ago
- Vietnamese GPT-J API service deployed with Docker & Helm chart☆10Dec 11, 2022Updated 3 years ago
- MemRec☆37Jan 16, 2026Updated last month
- Official repository for ICLR 2025 paper "Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs"☆16Mar 18, 2025Updated 11 months ago
- Precision Knowledge Editing (PKE): A novel method to reduce toxicity in LLMs while preserving performance, with robust evaluations and ha…☆11Nov 26, 2024Updated last year
- Teaching a humanoid to walk(ish), then displaying in your browser (using tensorflow.js and reinforcement learning)☆10Sep 7, 2020Updated 5 years ago
- DreamSmooth: Improving Model-Based RL with Reward Smoothing (ICLR 2024)☆12May 6, 2024Updated last year
- YOLO object detection algorithm☆15Jul 27, 2025Updated 7 months ago
- LLM Skirmish☆44Feb 3, 2026Updated last month
- ☆16Jan 16, 2025Updated last year
- Fleming-R1: Toward Expert-Level Medical Reasoning via Reinforcement Learning☆30Sep 29, 2025Updated 5 months ago