BullshitBench, created by Peter Gostev, measures whether AI models challenge nonsensical prompts instead of confidently answering them.
☆1,560 Apr 28, 2026 Updated this week
Alternatives and similar repositories for bullshit-benchmark
Users that are interested in bullshit-benchmark are comparing it to the libraries listed below.
- A lifeline for people dealing with Windows, especially after using macOS. ☆12 Apr 24, 2022 Updated 4 years ago
- [ Arxiv 2023 ] This repository contains the code for "MUPPET: Multi-Modal Few-Shot Temporal Action Detection" ☆15 Aug 30, 2023 Updated 2 years ago
- A framework aiming to bridge fast robot prototyping, predefined motion primitives, heterogeneous teleoperation, data collection, and flex… ☆26 Apr 4, 2026 Updated 3 weeks ago
- Official repository Flash Local Linear Attention ☆23 Apr 23, 2026 Updated last week
- Chat AI (↓↓Scroll to see more↓↓) ☆27 Jul 24, 2024 Updated last year
- OERbit is a Drupal-based publishing platform for publicly licensed learning resources (OER/OCW) ☆15 Jun 17, 2015 Updated 10 years ago
- ☆10 Dec 18, 2025 Updated 4 months ago
- Scripts for importing threat feeds and CTI articles, blogs, and reports into MISP. ☆18 Jun 16, 2025 Updated 10 months ago
- Show differences between directory trees ☆15 Aug 9, 2025 Updated 8 months ago
- Network for procedural editing of text with LLMs ☆23 Updated this week
- Reasoning Activation in LLMs via Small Model Transfer (NeurIPS 2025) ☆22 Oct 16, 2025 Updated 6 months ago
- MLX Implementation of Recursive Reasoning with Tiny Networks ☆78 Oct 11, 2025 Updated 6 months ago
- ☆18 Oct 21, 2024 Updated last year
- Official implementation for SSDD Single-Step Diffusion Decoder for Efficient Image Tokenization. ☆60 Mar 16, 2026 Updated last month
- Simple and Ideal Circuit Simulation ☆13 Dec 4, 2017 Updated 8 years ago
- An experimental and alternative approach to Finetuning and RAG. ☆34 Dec 9, 2023 Updated 2 years ago
- PeRL: Parameter-Efficient Reinforcement Learning ☆75 Apr 21, 2026 Updated last week
- Pipeline for generating RNAseq-based cancer patient reports ☆12 Apr 15, 2026 Updated 2 weeks ago
- ☆12 Sep 15, 2025 Updated 7 months ago
- RIESLING: Super-enhancer identification. (Rapid Identification of EnhancerS LInked to Nearby Genes) ☆10 Jun 28, 2018 Updated 7 years ago
- A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate. ☆38 Apr 22, 2026 Updated last week
- Pynocular is a lightweight ORM that lets you query your database using Pydantic models and asyncio ☆12 May 24, 2022 Updated 3 years ago
- "This Pokemon-esque, unlicensed Taiwanese Gameboy RPG rocks." ☆13 Aug 12, 2021 Updated 4 years ago
- An extensive and commented list of resources on Learned Sparse Retrieval. ☆51 Updated this week
- An unnecessarily tiny and minimal implementation of GPT-2 in NumPy. ☆11 Feb 12, 2023 Updated 3 years ago
- List of OSINT Capture The Flag platforms ☆55 Mar 27, 2026 Updated last month
- A Jupyter notebook building and training a Neural Network from scratch with NumPy ☆12 Aug 7, 2022 Updated 3 years ago
- ZwZ model family: SOTA fine-grained perception performance; ZoomBench: a new challenging perception benchmark ☆124 Mar 9, 2026 Updated last month
- Mental stress has become a standard part of day-to-day life. However, experiencing long-term and high-level stress affects the daily life… ☆15 Dec 8, 2022 Updated 3 years ago
- A collection of molecular modelling tools for UCSF Chimera ☆18 Mar 26, 2019 Updated 7 years ago
- Brainwave is a state-of-the-art neural decoder that transforms electroencephalogram (EEG) and brain signals into multimodal outputs inclu… ☆14 Oct 6, 2025 Updated 6 months ago
- A framework for evaluating function calls made by LLMs ☆40 Jul 23, 2024 Updated last year
- A minimalist Docker project to help people getting started with Node, WizardCoder, CTransformers, Python, Express and TypeScript. Ready t… ☆14 Jun 23, 2023 Updated 2 years ago
- Vecty + three.js = ♡ ☆11 Aug 26, 2018 Updated 7 years ago
- A Chrome Extension to monitor Google Wifi (or OnHub) status. ☆11 Apr 23, 2021 Updated 5 years ago
- LEMMA: Logical Engine for Multi-domain Mathematical Analysis ☆28 Feb 14, 2026 Updated 2 months ago
- PLLay: Efficient Topological Layer based on Persistence Landscapes ☆23 Dec 10, 2020 Updated 5 years ago
- Official implementation of "A simple, efficient and scalable contrastive masked autoencoder for learning visual representations". ☆37 Apr 3, 2023 Updated 3 years ago
- ☆22 Mar 4, 2025 Updated last year