The official repository for Toxic Commons and Celadon. Toxicity Classification for public domain data.
☆22Nov 10, 2024Updated last year
Alternatives and similar repositories for toxic-commons
Users that are interested in toxic-commons are comparing it to the libraries listed below
- Small python package to measure OCR quality and other related metrics.☆27Feb 19, 2024Updated 2 years ago
- PathPiece tokenizer☆13Nov 10, 2024Updated last year
- ☆18Feb 25, 2025Updated last year
- German Language Understanding Evaluation Benchmark @NAACL24☆22Dec 11, 2025Updated 2 months ago
- General repository for Data Science Bootcamps at Coding Dojo☆11Nov 13, 2025Updated 3 months ago
- Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages (ACL 2022)☆19May 17, 2022Updated 3 years ago
- ✂️ Sentence segmentation with wtpsplit's state-of-the-art Segment any Text (SaT) models☆36Oct 1, 2025Updated 5 months ago
- Data for the HIPE 2022 shared task.☆21Nov 29, 2023Updated 2 years ago
- An example of how to use spaCy for extremely large files without running into memory issues☆36Sep 17, 2022Updated 3 years ago
- ☆10Sep 13, 2025Updated 5 months ago
- Code for the ACL 2022 paper "Contextual Representation Learning beyond Masked Language Modeling"☆33Oct 23, 2022Updated 3 years ago
- ☆16Jun 22, 2022Updated 3 years ago
- ☆10Oct 2, 2024Updated last year
- Linear Attention for Efficient Bidirectional Sequence Modeling☆15May 13, 2025Updated 9 months ago
- ☆53Feb 10, 2025Updated last year
- maps are everything.☆10Jul 3, 2025Updated 8 months ago
- A CardDAV to IP phones converter for Node.js (AVM FRITZ!Box, Snom XCAP, Yealink)☆14Sep 30, 2025Updated 5 months ago
- Educational robot with support for various drive kinematics☆13Nov 23, 2025Updated 3 months ago
- The production website for SquiggleConf: a conference for excellent web dev tooling☆11Jan 27, 2026Updated last month
- A package to simplify integration of language models into Unity.☆16Oct 14, 2025Updated 4 months ago
- Development of a board that receives GPIO input and allows on/off control of connected solenoids.☆12Jan 13, 2026Updated last month
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation☆48Oct 20, 2025Updated 4 months ago
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023]☆14Jul 11, 2023Updated 2 years ago
- suffix array construction and searching algorithms for in-memory binary data.☆12Sep 10, 2022Updated 3 years ago
- ROS2 stack for the Sowbot open hardware reference platforms. We're working towards a precision-guided seeding/weeding robot.☆40Feb 27, 2026Updated last week
- A directory of Fediverse users working to make government and democracy better.☆13May 21, 2023Updated 2 years ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks☆12Nov 9, 2021Updated 4 years ago
- This project showcases the use of an ESP32-CAM with micro-ROS and an OLED screen to capture an image of a hand. The captured image is enc…☆10Sep 24, 2024Updated last year
- decontamination☆26Dec 3, 2025Updated 3 months ago
- Browser detection based on feature inspection☆27Feb 16, 2026Updated 2 weeks ago
- Realtime robot data visualization in the browser☆12Jan 30, 2022Updated 4 years ago
- OCCA Python API: JIT Compilation for Multiple Architectures☆11Dec 20, 2019Updated 6 years ago
- Random forests for longitudinal data using stochastic semiparametric mixed-model☆11May 15, 2022Updated 3 years ago
- ☆10Dec 17, 2020Updated 5 years ago
- Collection of description of concepts, procedures, and simple XSLT files for text processing, e.g. simplify InDesign documents (.idml) to…☆12Jan 9, 2020Updated 6 years ago
- ☆11Dec 17, 2021Updated 4 years ago
- Digital humanities centered on graph technologies☆10Feb 12, 2026Updated 3 weeks ago
- LLM-aided data filtering☆14Dec 3, 2024Updated last year
- our modeling of online misogyny☆11Jun 22, 2022Updated 3 years ago