TristanThrush/i-am-a-strange-dataset

Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models"

☆46

Alternatives and similar repositories for i-am-a-strange-dataset

Users who are interested in i-am-a-strange-dataset are comparing it to the repositories listed below.

choidami / inductive-oocr
☆16 · Mar 22, 2025 · Updated last year
duskybomb / hopfield-network
Implementation of Hopfield Neural Network in Python based on Hebbian Learning Algorithm
☆13 · Aug 10, 2019 · Updated 6 years ago
TristanThrush / perplexity-correlations
Simple and scalable tools for data-driven pretraining data selection.
☆30 · Jun 9, 2025 · Updated last year
havenpersona / lycon
Copyright-free Artificial Lyrics Dataset (ISMIR 2024 LBD)
☆12 · Sep 1, 2024 · Updated last year
philschmid / deep-learning-habana-huggingface
☆33 · Dec 9, 2022 · Updated 3 years ago
vedantpalit / Towards-Vision-Language-Mechanistic-Interpretability
This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce…
☆25 · Feb 16, 2026 · Updated 5 months ago
AntonMu / Census2020
Visualizing 230 years of US Census data
☆12 · Feb 23, 2020 · Updated 6 years ago
sfeucht / footprints
https://footprints.baulab.info
☆17 · Oct 4, 2024 · Updated last year
VSJMilewski / multimodal-probes
Code base for paper "Finding Structural Knowledge in Multimodal-BERT". Framework for probing and code for creating Scene Trees.
☆10 · May 19, 2022 · Updated 4 years ago
affjljoo3581 / polyglot-jax-inference
A Jax/Flax implementation for Korean LLM inference on TPUs.
☆12 · Jun 12, 2023 · Updated 3 years ago
Tebmer / Rereading-LLM-Reasoning
EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for…
☆30 · Dec 10, 2024 · Updated last year
gonglinyuan / metro_t0
Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023)
☆22 · Nov 1, 2023 · Updated 2 years ago
msclar / symbolictom
☆23 · Nov 8, 2023 · Updated 2 years ago
ekinakyurek / gpt3-arithmetic
Scratchpad/Chain-of-Thought Prompts
☆12 · Jun 6, 2022 · Updated 4 years ago
readme-generator / alreadyme-ai-serving
Serving large language models with transformers
☆13 · Oct 18, 2022 · Updated 3 years ago
coverist / coverist-android
Coverist - an AI service for generating book covers
☆13 · Sep 11, 2022 · Updated 3 years ago
amirzandieh / HyperAttention
Triton Implementation of HyperAttention Algorithm
☆48 · Dec 11, 2023 · Updated 2 years ago
alirezamshi / small100
Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E…
☆30 · Feb 8, 2023 · Updated 3 years ago
e-bug / cross-modal-ablation
[EMNLP 2021] Code and data for our paper "Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers…
☆20 · Jan 17, 2022 · Updated 4 years ago
Heidelberg-NLP / discourse-aware-semantic-self-attention
Repository for code and data from the EMNLP-IJCNLP 2019 paper "Discourse-aware Semantic Self-Attention for Narrative Reading Comprehensio…
☆17 · Jul 25, 2024 · Updated last year
Alab-NII / onecommon
☆18 · Sep 12, 2021 · Updated 4 years ago
neukg / KAT-TSLF
Source code of paper “A Novel Three-Stage Learning Framework for Low-Resource Knowledge-Grounded Dialogue Generation”
☆16 · Nov 25, 2021 · Updated 4 years ago
affjljoo3581 / KW-Computer-Vision-AI-1st-Solution
First-place solution for the Kwangwoon University Computer Vision AI Competition.
☆15 · Oct 5, 2022 · Updated 3 years ago
LeonEricsson / llmjudge
Exploring limitations of LLM-as-a-judge
☆20 · Aug 17, 2024 · Updated last year
tlkh / t2t-tuner
Convenient Text-to-Text Training for Transformers
☆18 · Dec 10, 2021 · Updated 4 years ago
vrjkmr / arxiv-topic
Detecting topic clusters in arXiv ML papers.
☆14 · Oct 10, 2020 · Updated 5 years ago
ash-neupane / multi-token-pred
Train toy models using a multi-token prediction objective
☆14 · Apr 18, 2026 · Updated 3 months ago
Avmb / inverse_scaling_prize_code_identifier_swap
Submission to the inverse scaling prize
☆23 · Jul 23, 2023 · Updated 2 years ago
JacobPfau / fillerTokens
☆76 · Apr 27, 2024 · Updated 2 years ago
sanyalsunny111 / Early_Weight_Avg
[COLM 2024] Early Weight Averaging meets High Learning Rates for LLM Pre-training
☆19 · Oct 12, 2024 · Updated last year
explanare / ravel
Evaluate interpretability methods on localizing and disentangling concepts in LLMs.
☆58 · Oct 30, 2025 · Updated 8 months ago
fastai / cards_deck
A minimal example of nbdev based on Allen Downey's Think Python 2nd Ed
☆11 · Jul 29, 2022 · Updated 3 years ago
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆20 · Jan 28, 2025 · Updated last year
peterbhase / SLAG-Belief-Updating
Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"
☆28 · May 2, 2022 · Updated 4 years ago
GeneralUserModels / napsack
☆16 · Apr 4, 2026 · Updated 3 months ago
affjljoo3581 / Google-American-Sign-Language-Fingerspelling-Recognition
🎖️ 5th place solution in the Google American Sign Language Fingerspelling Recognition Competition 🎖️
☆16 · Sep 19, 2023 · Updated 2 years ago
lovit / petitions_archive
Data archive of Blue House (Cheong Wa Dae) national petitions
☆16 · Aug 29, 2020 · Updated 5 years ago
machelreid / m2d2
M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer
☆54 · Nov 21, 2022 · Updated 3 years ago
Open-Knowledge-Korea / covid-19-our-memory
Analysis of Korean society's response to COVID-19 in Korea and of data-driven sociocultural issues
☆22 · May 16, 2022 · Updated 4 years ago