☆34 · Nov 7, 2024 · Updated last year
Alternatives and similar repositories for Truth_is_Universal
Users that are interested in Truth_is_Universal are comparing it to the libraries listed below.
- ☆107 · Aug 8, 2024 · Updated last year
- Improving transparency of large language models' reasoning ☆15 · Nov 25, 2025 · Updated 6 months ago
- ☆92 · Jan 22, 2025 · Updated last year
- ☆27 · Jun 10, 2025 · Updated 11 months ago
- Official Code for our paper: "Language Models Learn to Mislead Humans via RLHF" ☆19 · Oct 11, 2024 · Updated last year
- A series of BERT and Albert model checkpoints trained to reduce gendered correlations in pre-training ☆11 · Oct 22, 2020 · Updated 5 years ago
- Creating a game to play Figgie & Train an agent to play against ☆15 · Dec 3, 2022 · Updated 3 years ago
- ☆33 · Nov 16, 2025 · Updated 6 months ago
- Code for "On Measuring Faithfulness of Natural Language Explanations" ☆22 · Jul 23, 2024 · Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆72 · Jun 19, 2024 · Updated last year
- Codebase for "On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback". This repo implements a generative multi-tur… ☆25 · Dec 3, 2024 · Updated last year
- ☆17 · Feb 26, 2024 · Updated 2 years ago
- Prolog specification of TensorFlow layers ☆13 · Jun 12, 2023 · Updated 2 years ago
- Official Implementation of "The Graph Database Interface: Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thou… ☆14 · Jul 2, 2025 · Updated 10 months ago
- Scalable DBSCAN and OPTICS for clustering high-dimensional datasets using random projections ☆14 · Nov 1, 2024 · Updated last year
- Code for "Astraea: Grammar-based Fairness Testing" ☆10 · Jan 7, 2022 · Updated 4 years ago
- ☆16 · Apr 27, 2024 · Updated 2 years ago
- Template for Python-based data science projects in the Alexandra Institute. ☆12 · May 14, 2026 · Updated 2 weeks ago
- Explore, Establish, Exploit: Red Teaming Language Models from Scratch ☆15 · Jun 21, 2023 · Updated 2 years ago
- Intersectional bias in hate speech and abusive language datasets ☆15 · Jan 25, 2024 · Updated 2 years ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆66 · Oct 27, 2024 · Updated last year
- LoFiT: Localized Fine-tuning on LLM Representations ☆45 · Jan 15, 2025 · Updated last year
- Can We Trust Large Language Models?: A Benchmark for Responsible Large Language Models via Toxicity, Bias, and Value-alignment Evaluation ☆26 · Oct 12, 2023 · Updated 2 years ago
- ☆18 · May 18, 2021 · Updated 5 years ago
- Sparse probing paper full code. ☆68 · Dec 17, 2023 · Updated 2 years ago
- Attribution-based Parameter Decomposition ☆34 · Jun 11, 2025 · Updated 11 months ago
- ☆20 · Aug 26, 2018 · Updated 7 years ago
- Data set for LREC 2020 paper "I Feel Offended, Don't Be Abusive!" ☆19 · Sep 23, 2023 · Updated 2 years ago
- 😇 A curated list of links and resources for Fair ML and Data Ethics ☆18 · May 31, 2022 · Updated 3 years ago
- Analogs of Linguistic Structure in Deep Representations ☆19 · Jul 27, 2017 · Updated 8 years ago
- My presentation on Cyber Grand Challenge and DEFCON 24 CTF at SHLUG monthly meeting ☆13 · Sep 24, 2016 · Updated 9 years ago
- Discriminative Feature Selection via A Structured Sparse Subspace Learning Module ☆12 · Apr 15, 2022 · Updated 4 years ago
- ☆191 · Mar 8, 2026 · Updated 2 months ago
- ☆96 · Mar 13, 2026 · Updated 2 months ago
- Official code for "Vision Transformers with Self-Distilled Registers" (NeurIPS 2025 Spotlight) ☆34 · Dec 6, 2025 · Updated 5 months ago
- My personal CV in a git-inspired style. Check it out! ☆10 · Mar 29, 2024 · Updated 2 years ago
- Accompanying codebase for neuroscope.io, a website for displaying max activating dataset examples for language model neurons ☆14 · Feb 13, 2023 · Updated 3 years ago
- ☆12 · Mar 19, 2021 · Updated 5 years ago
- Convert java 11 JFR to folded format for https://github.com/brendangregg/FlameGraph ☆12 · May 23, 2019 · Updated 7 years ago