The code for “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning”
☆17 · Feb 26, 2024 · Updated 2 years ago
Alternatives and similar repositories for W2SG
Users who are interested in W2SG are comparing it to the repositories listed below:
- Safety-J: Evaluating Safety with Critique ☆16 · Jul 28, 2024 · Updated last year
- Evaluate the Quality of Critique ☆36 · Jun 1, 2024 · Updated last year
- [ACL 2024] Code for "MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation" ☆43 · Jul 19, 2024 · Updated last year
- Implementation of Influence Function approximations for differently sized ML models, using PyTorch ☆16 · Sep 15, 2023 · Updated 2 years ago
- ☆78 · May 22, 2024 · Updated last year
- ☆22 · Feb 26, 2024 · Updated 2 years ago
- ☆21 · Mar 17, 2025 · Updated 11 months ago
- [AAAI 2024] MELO: Enhancing Model Editing with Neuron-indexed Dynamic LoRA ☆27 · Apr 9, 2024 · Updated last year
- ☆27 · Mar 27, 2025 · Updated 11 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆81 · Apr 10, 2023 · Updated 2 years ago
- ☆38 · Feb 8, 2024 · Updated 2 years ago
- BeHonest: Benchmarking Honesty in Large Language Models ☆34 · Aug 15, 2024 · Updated last year
- A Framework for Evaluating AI Agent Safety in Realistic Environments ☆30 · Oct 2, 2025 · Updated 5 months ago
- Trending projects & awesome papers about data-centric LLM studies. ☆40 · May 20, 2025 · Updated 9 months ago
- ☆85 · Jan 25, 2025 · Updated last year
- EMNLP 2024: Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue ☆38 · May 26, 2025 · Updated 9 months ago
- MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols ☆17 · Nov 19, 2025 · Updated 3 months ago
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues ☆13 · Feb 7, 2024 · Updated 2 years ago
- Meta-Reinforcement Learning with Policy Residual Representation ☆11 · Aug 15, 2019 · Updated 6 years ago
- ☆44 · Sep 19, 2024 · Updated last year
- ☆98 · Jun 27, 2024 · Updated last year
- Code and data for NAACL 2025 paper "IHEval: Evaluating Language Models on Following the Instruction Hierarchy" ☆17 · Feb 25, 2025 · Updated last year
- A Python tool to help interact with ChatGPT. ☆10 · Dec 11, 2022 · Updated 3 years ago
- ☆16 · Mar 17, 2025 · Updated 11 months ago
- Information Extraction related tools and models ☆10 · Mar 16, 2023 · Updated 2 years ago
- FamilyTool benchmark ☆12 · Sep 10, 2025 · Updated 5 months ago
- The repo for the paper "Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models". ☆13 · Dec 16, 2024 · Updated last year
- Prompt Generator model for Stable Diffusion Models ☆11 · Jun 20, 2023 · Updated 2 years ago
- The source code of "Empowering Language Understanding with Counterfactual Reasoning" (ACL'21) ☆11 · Sep 3, 2021 · Updated 4 years ago
- Code for the NAACL 2024 HCI+NLP Workshop paper "LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tool… ☆13 · Mar 24, 2024 · Updated last year
- A robust metric (robust fidelity) for XGNN (ICLR24) ☆12 · Jun 3, 2025 · Updated 9 months ago
- ☆12 · Oct 5, 2022 · Updated 3 years ago
- Code for "Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks" (TIFS2024) ☆13 · Mar 29, 2024 · Updated last year
- A Video Quality Analysis Toolkit ☆13 · May 16, 2025 · Updated 9 months ago
- Code for the paper "SizeShiftReg: a Regularization Method for Improving Size-Generalization in Graph Neural Networks" ☆12 · Jan 17, 2023 · Updated 3 years ago
- Scripts for KGIRNet model for ESWC ☆10 · Jul 6, 2023 · Updated 2 years ago
- ☆10 · Dec 18, 2024 · Updated last year
- ☆20 · Feb 3, 2025 · Updated last year
- Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training" ☆11 · Oct 27, 2025 · Updated 4 months ago