thestephencasper / gpt4_bs
Examples of prompts that cause ChatGPT-4 to hallucinate.
☆31 Updated 2 years ago
Alternatives and similar repositories for gpt4_bs
Users who are interested in gpt4_bs are comparing it to the repositories listed below
- The codebase for Inducing Causal Structure for Interpretable Neural Networks ☆10 Updated 3 years ago
- ☆106 Updated 7 months ago
- Steering vectors for transformer language models in Pytorch / Huggingface ☆124 Updated 7 months ago
- ☆81 Updated 7 months ago
- Erasing concepts from neural representations with provable guarantees ☆237 Updated 8 months ago
- Utilities for the HuggingFace transformers library ☆72 Updated 2 years ago
- Highlight errors in a bib file: missing URLs, capitalization protection, etc ☆27 Updated last year
- ☆22 Updated 5 months ago
- NeuroSurgeon is a package that enables researchers to uncover and manipulate subnetworks within models in Huggingface Transformers ☆41 Updated 7 months ago
- Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery" ☆41 Updated last year
- ☆127 Updated last year
- Materials for EACL2024 tutorial: Transformer-specific Interpretability ☆60 Updated last year
- ☆242 Updated last year
- Unified access to Large Language Model modules using NNsight ☆47 Updated 2 weeks ago
- PyTorch and NNsight implementation of AtP* (Kramar et al 2024, DeepMind) ☆19 Updated 8 months ago
- How do transformer LMs encode relations? ☆53 Updated last year
- ☆123 Updated last year
- Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs) ☆48 Updated 2 months ago
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs. ☆54 Updated 11 months ago
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs. ☆126 Updated last year
- ☆51 Updated 2 months ago
- ☆276 Updated last year
- Data and code for the Corr2Cause paper (ICLR 2024) ☆111 Updated last year
- ☆109 Updated 7 months ago
- Attribution-based Parameter Decomposition ☆30 Updated 3 months ago
- Find context neurons in Pythia models. ☆14 Updated 2 years ago
- Sparse probing paper full code. ☆61 Updated last year
- Keeping language models honest by directly eliciting knowledge encoded in their activations. ☆209 Updated last week
- ☆18 Updated 7 months ago
- A library for mechanistic anomaly detection ☆22 Updated 8 months ago