ruyimarone/data-portraits

Documenting large text datasets 🖼️ 📚

☆14

Alternatives and similar repositories for data-portraits

Users who are interested in data-portraits are comparing it to the repositories listed below.

mireshghallah / ft-memorization
☆13 · Oct 20, 2022 · Updated 3 years ago
boyiwei / CoTaEval
[NeurIPS 2024 D&B] Evaluating Copyright Takedown Methods for Language Models
☆17 · Jul 17, 2024 · Updated last year
EleutherAI / elk-generalization
Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e…
☆28 · May 23, 2024 · Updated last year
julian-risch / toxic-comment-collection
Code for our WOAH@ACL 2021 Paper on Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in …
☆30 · Nov 25, 2021 · Updated 4 years ago
theopensystemslab / planx-new
Plan✕ is a platform for creating and publishing digital planning services
☆17 · Updated this week
JHU-CLSP / ettin-encoder-vs-decoder
State-of-the-art paired encoder and decoder models (17M-1B params)
☆59 · Aug 6, 2025 · Updated 7 months ago
Sunscreen-tech / spf
This repository contains the Parasol processor, which enables next-generation privacy preserving applications. Users can run arbitrary co…
☆11 · Feb 25, 2026 · Updated last week
mengzili / jemdoc-python3
An unofficial Python 3 version of jemdoc.
☆11 · Feb 8, 2026 · Updated 3 weeks ago
Emperor-WS / PyEmber
An Educational Framework Based on PyTorch for Deep Learning Education and Exploration
☆10 · Dec 24, 2023 · Updated 2 years ago
zh1yu4nyu / CodeIPPrompt
https://icml.cc/virtual/2023/poster/24354
☆10 · Aug 15, 2023 · Updated 2 years ago
LCS2-IIITD / quarc-counterspeech
[ACL 2023] Counterspeeches up my sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generati…
☆10 · Sep 23, 2023 · Updated 2 years ago
Greysahy / ipiguard
[EMNLP 2025 Oral] IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents
☆16 · Sep 16, 2025 · Updated 5 months ago
Aurora-cx / EmotionCircuits-LLM
EmotionCircuits-LLM: A complete, reproducible framework for discovering and controlling emotion circuits in large language models.
☆25 · Oct 20, 2025 · Updated 4 months ago
Brett-z / LayerEditing
A Model Agnostic function to directly remove specified layers from the LLM
☆10 · May 23, 2024 · Updated last year
GuillaumeBriffoteaux / pySBO
Python platform for parallel Surrogate-Based Optimization
☆12 · Nov 27, 2024 · Updated last year
ottowg / gsap-ner
☆10 · Oct 2, 2024 · Updated last year
smitkiri / news-qa
Reading comprehension based question-answering model for news articles.
☆11 · Jun 22, 2022 · Updated 3 years ago
ronakdm / ml-interviews
Guide to interviewing for industry machine learning roles (data/applied/research scientist, ML engineer, etc.).
☆11 · Dec 28, 2022 · Updated 3 years ago
ul-fmf / mlfmf-data
Machine Learning for Mathematical Formalization
☆11 · Jul 20, 2024 · Updated last year
ihcr / learning_to_adapt
☆15 · Sep 7, 2025 · Updated 6 months ago
control-toolbox / CTDirect.jl
Direct transcription of an optimal control problem and resolution
☆12 · Updated this week
ariahw / rl-rewardhacking
☆24 · Feb 18, 2026 · Updated 2 weeks ago
oceanumeric / EnteRAG
A RAG that can scale 🧑🏻‍💻
☆11 · May 28, 2024 · Updated last year
iamgroot42 / mimir
Python package for measuring memorization in LLMs.
☆183 · Jul 16, 2025 · Updated 7 months ago
LostOxygen / llm-confidentiality
Whispers in the Machine: Confidentiality in Agentic Systems
☆41 · Dec 11, 2025 · Updated 2 months ago
RUCBM / ICLEval
☆14 · Jun 24, 2024 · Updated last year
orf / pytest-scrutinize
Find bottlenecks in your test suites
☆17 · Updated this week
drozzy / reinforce
Implementation of Reinforce for educational purposes.
☆12 · Jun 12, 2023 · Updated 2 years ago
Exawind / exawind-driver
Driver for coupled AMR-Wind/Nalu-Wind simulations
☆13 · Nov 10, 2025 · Updated 3 months ago
conda-forge / ollama-feedstock
A conda-smithy repository for ollama.
☆10 · Updated this week
haileyschoelkopf / triton-index
See https://github.com/cuda-mode/triton-index/ instead!
☆11 · May 8, 2024 · Updated last year
znah / tt09
☆15 · Jun 30, 2025 · Updated 8 months ago
carterworks / substack-to-epub
Given a Substack newsletter, save the contents into an SQLite db and format it as an EPUB
☆13 · Jan 11, 2024 · Updated 2 years ago
allenai / neurodiscoverybench
☆16 · Jan 29, 2026 · Updated last month
textiles-lab / fenced-tangle-supplemental
☆14 · Dec 12, 2023 · Updated 2 years ago
jmanhype / ace-adaptive-code-evolution
ACE (Adaptive Code Evolution) is an AI-powered system for code analysis and optimization.
☆12 · Nov 4, 2025 · Updated 4 months ago
ademakdogan / plant_detector
PlantDetector provides easy development (training and prediction) for object detection. DETR (End-to-End Object Detection with Transforme…
☆11 · Aug 1, 2022 · Updated 3 years ago
LinxinS97 / NLPBench
NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models
☆10 · Oct 27, 2023 · Updated 2 years ago
johnsyweb / python_sparse_list
A list where most values will be None (or default)
☆11 · Jul 19, 2023 · Updated 2 years ago