AlignmentResearch / scaling-poisoning
☆16Nov 18, 2024Updated last year
Alternatives and similar repositories for scaling-poisoning
Users interested in scaling-poisoning are comparing it to the libraries listed below
- Auditing agents for fine-tuning safety☆18Oct 21, 2025Updated 3 months ago
- Project exploring 3D volumetric rendering of NEXRAD radar data.☆11Oct 23, 2023Updated 2 years ago
- Dedicated to building MT4 or MT5 copy-trading platforms☆25Updated this week
- [USENIX'25] HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns☆13Mar 1, 2025Updated 11 months ago
- A library for training crosscoders☆15May 28, 2025Updated 8 months ago
- A collection of reusable workflows and composite actions to help developers kickstart their pipelines.☆13Oct 11, 2024Updated last year
- Fast wavelet transforms on the sphere☆13Dec 20, 2016Updated 9 years ago
- Creating and querying materialized views from Django.☆11Aug 13, 2021Updated 4 years ago
- My own language, created in the Compilers Principle Lab, which I call Quary. This repository provides all the source code.☆12Jan 25, 2021Updated 5 years ago
- Customizable charts made with TikZ and LaTeX3☆14Feb 11, 2023Updated 3 years ago
- The AI that helps you achieve your goals☆11Feb 4, 2024Updated 2 years ago
- ☆12Feb 6, 2026Updated last week
- ICML2025: One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework☆14Jun 24, 2025Updated 7 months ago
- see github.com/understanding-search/maze-transformer☆10Dec 8, 2023Updated 2 years ago
- An analog touch screen joystick that pretends to be a bevy gamepad☆13Jul 13, 2024Updated last year
- To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and …☆11Jun 18, 2024Updated last year
- ✒️ A gallery of experiments with Scalable Vector Graphics (SVG) and interactive visualizations.☆13Jan 6, 2023Updated 3 years ago
- The implementation of our IEEE S&P 2024 paper "Securely Fine-tuning Pre-trained Encoders Against Adversarial Examples".☆11Jun 28, 2024Updated last year
- Real-time sensor for the Cambridge Coffee Pot (Computer Lab)☆12Mar 24, 2020Updated 5 years ago
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons☆13Feb 13, 2023Updated 3 years ago
- Project of ACL 2025 "UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models"☆15Mar 25, 2025Updated 10 months ago
- 🧠 Inspecting complexity and goal-directedness of imagination in an fNIRS BCI system.☆11Aug 26, 2023Updated 2 years ago
- [COLING 2025🔥] Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection☆16Jan 21, 2025Updated last year
- ☆17Jan 5, 2026Updated last month
- Flight Recorder lets you record client program execution and examine it later☆11Sep 18, 2020Updated 5 years ago
- Simple (fast) transformer inference in PyTorch with torch.compile + lit-llama code☆10Aug 29, 2023Updated 2 years ago
- Official repository for WWW'24 paper "MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation"☆12Jul 25, 2024Updated last year
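- 2019: A knowledge-graph-based intelligent question-answering system for the campus informatization domain at BUPT (Beijing University of Posts and Telecommunications)☆10May 1, 2023Updated 2 years ago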
- Benchmarking Large Language Models' Resistance to Malicious Code☆14Dec 1, 2024Updated last year
- ☆16Feb 17, 2025Updated 11 months ago
- Automated terminal emulator benchmarks☆22Jan 14, 2026Updated 3 weeks ago
- Early exit ensembles☆12Dec 4, 2021Updated 4 years ago
- Code for the paper "Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM"☆14Nov 17, 2023Updated 2 years ago
- Experiments with representation engineering☆13Feb 28, 2024Updated last year
- [NDSS'25] The official implementation of safety misalignment.☆17Jan 8, 2025Updated last year
- ☆11Jun 8, 2023Updated 2 years ago
- JavaScript SDK for the CopyFactory trade copying API. Can copy trades between MetaTrader 5 (MT5) and MetaTrader 4 (MT4)☆17Nov 8, 2023Updated 2 years ago
- A tiny easily hackable implementation of a feature dashboard.☆15Oct 21, 2025Updated 3 months ago
- ☆12Oct 23, 2022Updated 3 years ago