umsi-arwhyte / SI506-practice
Retired problem sets and lab exercises made available for self-study.
☆16 Updated 3 years ago
Alternatives and similar repositories for SI506-practice:
Users who are interested in SI506-practice are comparing it to the repositories listed below
- ☆277 Updated 4 years ago
- A Chinese Translation of Stanford CS229 notes (Stanford machine learning course CS229 lecture notes) ☆242 Updated 2 years ago
- This is the code repository for "Uncovering Safety Risks of Large Language Models through Concept Activation Vector" ☆25 Updated 3 months ago
- ☆41 Updated last year
- A survey on harmful fine-tuning attack for large language model ☆136 Updated this week
- ☆28 Updated 8 months ago
- Solutions for CS224n (2022) ☆59 Updated 10 months ago
- [NAACL2024] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey ☆87 Updated 6 months ago
- Weak-to-Strong Jailbreaking on Large Language Models ☆72 Updated last year
- ☆22 Updated 10 months ago
- Official Code for "Baseline Defenses for Adversarial Attacks Against Aligned Language Models" ☆22 Updated last year
- Implementation for "RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content" ☆20 Updated 6 months ago
- Accepted by ECCV 2024 ☆99 Updated 4 months ago
- The reinforcement learning codes for dataset SPA-VL ☆28 Updated 7 months ago
- Code for ACM MM2024 paper: White-box Multimodal Jailbreaks Against Large Vision-Language Models ☆22 Updated last month
- Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models" ☆84 Updated 5 months ago
- The answers for all labs, hws, and projects in Data100(DS100) ☆46 Updated 2 years ago
- Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep ☆73 Updated 7 months ago
- [ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization ☆19 Updated 7 months ago
- A resource repository for representation engineering in large language models ☆104 Updated 3 months ago
- A resource repository for machine unlearning in large language models ☆314 Updated this week
- ☆46 Updated 7 months ago
- This repository contains my solutions to the assignments for Stanford's CS224n "Natural Language Processing with Deep Learning" (Winter 2… ☆130 Updated last year
- ☆41 Updated 8 months ago
- "Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning" by Chongyu Fan*, Jiancheng Liu*, Licong Lin*, Jingh… ☆21 Updated last month
- Introduction to Signals and Systems ☆10 Updated 7 months ago
- ☆20 Updated 7 months ago