A mechanistic approach for understanding and detecting factual errors of large language models.
☆49 · Jul 6, 2024 · Updated last year
Alternatives and similar repositories for mechanistic-error-probe
Users that are interested in mechanistic-error-probe are comparing it to the libraries listed below.
- quica is a tool to run inter coder agreement pipelines in an easy and effective way. Multiple measures are run and results are collected… ☆23 · Nov 9, 2020 · Updated 5 years ago
- ☆85 · Mar 26, 2024 · Updated 2 years ago
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto ☆57 · May 17, 2024 · Updated 2 years ago
- Some numerical optimization methods implemented in Haskell ☆47 · Jun 24, 2020 · Updated 5 years ago
- 👁️ Isometric 3D Graphing / Rendering module for Haskell ☆15 · Sep 2, 2017 · Updated 8 years ago
- ☆10 · Nov 1, 2022 · Updated 3 years ago
- ☆12 · Oct 25, 2023 · Updated 2 years ago
- Shared code for training sentence embeddings with Flax / JAX ☆28 · Jul 15, 2021 · Updated 4 years ago
- ☆16 · Sep 27, 2023 · Updated 2 years ago
- A repository to keep tools, scripts, data for SMART task. ☆11 · May 24, 2022 · Updated 4 years ago
- ☆21 · Mar 19, 2024 · Updated 2 years ago
- IAI Style Guide ☆11 · Jun 27, 2025 · Updated 11 months ago
- A Dataset and Results for Classifying Emotions Across Languages ☆10 · Jun 20, 2021 · Updated 4 years ago
- Tidy autoregressive inference in JAX ☆15 · Sep 1, 2025 · Updated 9 months ago
- Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery" ☆47 · May 31, 2024 · Updated 2 years ago
- Belief in the Machine: Investigating Epistemological Blind Spots of Language Models ☆34 · Apr 19, 2025 · Updated last year
- ☆10 · Jul 13, 2024 · Updated last year
- ☆15 · Apr 26, 2025 · Updated last year
- [ICME 2019] Source code and datasets for "Semi-supervised Compatibility Learning Across Categories for Clothing Matching" ☆11 · Apr 26, 2024 · Updated 2 years ago
- Data and download script to accompany LREC2020 paper "Automated Fact-Checking of Claims from Wikipedia" ☆13 · Jul 19, 2023 · Updated 2 years ago
- Code for our paper "Decomposing The Dark Matter of Sparse Autoencoders" ☆23 · Feb 6, 2025 · Updated last year
- Code for the paper "Understanding RL Vision" ☆51 · Apr 2, 2023 · Updated 3 years ago
- Zero-Shot Learning in Named Entity Recognition with Common Sense Knowledge ☆17 · Nov 16, 2021 · Updated 4 years ago
- This is the code corresponding to our publication introducing ConvDecoder with physics-based regularization (CD+r) for MRI ☆10 · Feb 6, 2026 · Updated 4 months ago
- Mental state inference from observable behavior ☆15 · Dec 3, 2021 · Updated 4 years ago
- Python package to compute interaction indices that extend the Shapley Value. AISTATS 2023. ☆19 · Sep 25, 2023 · Updated 2 years ago
- LSTM-VAE for Time Series Anomaly Detection ☆10 · Feb 21, 2021 · Updated 5 years ago
- Beyond Myopia: Learning from Positive and Unlabeled Data through Holistic Predictive Trends [NeurIPS 2023] ☆10 · Jan 28, 2024 · Updated 2 years ago
- This repo contains the ToMnet+ model for preference inference. Developed by Yun-Shiuan, Edwinn, Hsin-Yi, and Elaine. ☆10 · Feb 24, 2023 · Updated 3 years ago
- [CVPR'25] Official code of paper "Mimic In-Context Learning for Multimodal Tasks" ☆26 · May 21, 2026 · Updated 3 weeks ago
- This repository contains data, code and models for contextual noncompliance. ☆26 · Jul 18, 2024 · Updated last year
- ☆15 · Jul 12, 2024 · Updated last year
- Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF" ☆34 · Dec 14, 2023 · Updated 2 years ago
- A Python package to compute HONEST, a score to measure hurtful sentence completions in language models. Published at NAACL 2021. ☆21 · Apr 8, 2025 · Updated last year
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?" ☆20 · Jun 11, 2025 · Updated last year
- ☆23 · Jun 8, 2022 · Updated 4 years ago
- Find context neurons in Pythia models. ☆13 · Jun 13, 2023 · Updated 3 years ago
- Code for "A Multi-Task BERT Model for Schema-Guided Dialogue State Tracking" ☆14 · May 26, 2023 · Updated 3 years ago
- Interaction-first method for generating demonstrations for web-agents on any website ☆56 · Apr 29, 2025 · Updated last year