ADA-research / VERONA
A lightweight Python package for setting up adversarial robustness experiments and computing robustness distributions. The package implements adversarial attacks and can be extended with the auto-verify plugin to enable complete verification.
☆34 · Updated last week
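The "robustness distribution" that VERONA computes is, informally, the empirical distribution of per-input critical perturbation radii: for each input, the smallest perturbation that changes the model's prediction. The sketch below is purely illustrative and does not use VERONA's actual API; it assumes a toy linear classifier, for which the exact minimal L2 perturbation has the closed form |w·x + b| / ||w||.

```python
import math

def critical_radius(w, b, x):
    """Exact minimal L2 perturbation that flips a linear classifier's
    decision sign(w.x + b): distance from x to the decision hyperplane."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(score) / norm

# Hypothetical toy classifier and a small input set (not from VERONA).
w, b = [3.0, 4.0], -1.0
inputs = [[1.0, 1.0], [0.5, -0.2], [-1.0, 2.0]]

# The robustness distribution is the empirical distribution of these radii;
# sorting them gives the empirical CDF of critical epsilons.
radii = sorted(critical_radius(w, b, x) for x in inputs)
```

For real networks there is no closed form, which is why VERONA pairs empirical attacks (upper bounds on the critical radius) with complete verification via the auto-verify plugin (lower bounds).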
Alternatives and similar repositories for VERONA
Users interested in VERONA are comparing it to the libraries listed below.
- Interpretability for sequence generation models 🐛 🔍 ☆437 · Updated 4 months ago
- The website for Danish Foundation Models, a project for training foundational Danish language models. ☆74 · Updated last week
- The nnsight package enables interpreting and manipulating the internals of deep learning models. ☆646 · Updated this week
- ☆81 · Updated 6 months ago
- Unified access to Large Language Model modules using NNsight ☆44 · Updated last week
- 🪄 Interpreto is an interpretability toolbox for LLMs ☆34 · Updated this week
- ☆336 · Updated 2 weeks ago
- Official JAX implementation of xLSTM including fast and efficient training and inference code. 7B model available at https://huggingface.… ☆101 · Updated 7 months ago
- Sparse Autoencoder for Mechanistic Interpretability ☆260 · Updated last year
- Stanford NLP Python library for understanding and improving PyTorch models via interventions ☆800 · Updated this week
- A list of awesome open source projects in the machine learning field whose developers are mainly based in Germany ☆46 · Updated 11 months ago
- Explainable Siamese sentence transformers ☆13 · Updated last year
- A HuggingFace-compatible Small Language Model trainer. ☆76 · Updated 7 months ago
- ☆15 · Updated 2 weeks ago
- A Scandinavian Benchmark for sentence embeddings ☆40 · Updated 3 months ago
- ☆342 · Updated last week
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆216 · Updated 8 months ago
- A package for statistically rigorous scientific discovery using machine learning. Implements prediction-powered inference. ☆255 · Updated 3 months ago
- Annotated implementation of vanilla Transformers to guide readers through all the ambiguities. ☆10 · Updated 2 months ago
- Erasing concepts from neural representations with provable guarantees ☆232 · Updated 7 months ago
- Benchmarks for the Evaluation of LLM Supervision ☆32 · Updated last month
- Mechanistic Interpretability Visualizations using React ☆282 · Updated 8 months ago
- Attribution-based Parameter Decomposition ☆30 · Updated 2 months ago
- Repository for TabICL: A Tabular Foundation Model for In-Context Learning on Large Data ☆167 · Updated last week
- 🔬 Interpretability for Leela Chess Zero networks. ☆15 · Updated this week
- Library for Jacobian descent with PyTorch. It enables the optimization of neural networks with multiple losses (e.g. multi-task learning)… ☆265 · Updated this week
- ☆238 · Updated 11 months ago
- ☆30 · Updated last week
- ☆20 · Updated 7 months ago
- Quantus is an eXplainable AI toolkit for responsible evaluation of neural network explanations ☆621 · Updated last month