Pleias / toxic-commons
The official repository for Toxic Commons and Celadon. Toxicity Classification for public domain data.
☆22 · Nov 10, 2024 · Updated last year
Alternatives and similar repositories for toxic-commons
Users who are interested in toxic-commons are comparing it to the libraries listed below.
- Small Python package to measure OCR quality and other related metrics. ☆27 · Feb 19, 2024 · Updated last year
- PathPiece tokenizer ☆13 · Nov 10, 2024 · Updated last year
- German Language Understanding Evaluation Benchmark @NAACL24 ☆22 · Dec 11, 2025 · Updated 2 months ago
- ☆18 · Feb 25, 2025 · Updated 11 months ago
- General repository for Data Science Bootcamps at Coding Dojo ☆11 · Nov 13, 2025 · Updated 3 months ago
- Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages (ACL 2022) ☆19 · May 17, 2022 · Updated 3 years ago
- ✂️ Sentence segmentation with wtpsplit's state-of-the-art Segment any Text (SaT) models ☆35 · Oct 1, 2025 · Updated 4 months ago
- Data for the HIPE 2022 shared task. ☆21 · Nov 29, 2023 · Updated 2 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 · Sep 17, 2022 · Updated 3 years ago
- ☆10 · Sep 13, 2025 · Updated 5 months ago
- Code for the ACL 2022 paper "Contextual Representation Learning beyond Masked Language Modeling" ☆33 · Oct 23, 2022 · Updated 3 years ago
- The production website for SquiggleConf: a conference for excellent web dev tooling ☆11 · Jan 27, 2026 · Updated 2 weeks ago
- Development of a board that receives GPIO input and allows on/off control of connected solenoids. ☆12 · Jan 13, 2026 · Updated last month
- ☆53 · Feb 10, 2025 · Updated last year
- Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder ☆10 · Mar 16, 2023 · Updated 2 years ago
- A package to simplify integration of language models into Unity. ☆16 · Oct 14, 2025 · Updated 4 months ago
- My favorite GNU/Linux flavor on the Microsoft Surface Duo. ☆10 · Feb 7, 2024 · Updated 2 years ago
- ☆10 · Oct 2, 2024 · Updated last year
- A low-cost 3D lidar based on ydlidar-x4 ☆11 · Apr 10, 2021 · Updated 4 years ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆48 · Oct 20, 2025 · Updated 3 months ago
- This project showcases the use of an ESP32-CAM with micro-ROS and an OLED screen to capture an image of a hand. The captured image is enc… ☆10 · Sep 24, 2024 · Updated last year
- The UKWA Heritrix3 custom modules and Docker builder. ☆11 · Dec 2, 2024 · Updated last year
- An agent-based model for scientific inquiry based on abstract argumentation ☆13 · Jan 17, 2022 · Updated 4 years ago
- Browser detection based on feature inspection ☆27 · Feb 2, 2026 · Updated last week
- 0-Shot Tokenizer Transplant ☆14 · May 16, 2025 · Updated 8 months ago
- A patchless architecture, based on MLP-Mixer ☆18 · Dec 30, 2021 · Updated 4 years ago
- OCaml PPX extension for automatically generating Irmin types ☆11 · Jan 14, 2020 · Updated 6 years ago
- Ranger helps you see the forest among the trees - Ranger is an effect-size meta analysis library creating beautiful forest plots! ☆11 · Jun 12, 2023 · Updated 2 years ago
- The Modern ROS IDE ☆36 · Dec 21, 2025 · Updated last month
- Benchmarks for Evaluating Spanish Language Models ☆11 · Jun 14, 2023 · Updated 2 years ago
- Collection of description of concepts, procedures, and simple XSLT files for text processing, e.g. simplify InDesign documents (.idml) to… ☆12 · Jan 9, 2020 · Updated 6 years ago
- Shallow baseline models for text in TensorFlow ☆12 · Jul 1, 2017 · Updated 8 years ago
- ☆12 · Feb 3, 2026 · Updated last week
- SciKit Sequitur is an Apache2 licensed Python module for inferring compositional hierarchies from sequences. ☆10 · Oct 13, 2021 · Updated 4 years ago
- 🌴 Small JavaScript utilities. ☆14 · Nov 28, 2023 · Updated 2 years ago
- Digital humanities centered on graph technologies ☆10 · Updated this week
- Built a system from scratch in Python which can detect spelling and grammatical errors in a word and sentence respectively using N-gram b… ☆15 · Jul 4, 2021 · Updated 4 years ago
- Generate cryptographically secure, memorable passphrases. ☆13 · Jun 4, 2020 · Updated 5 years ago
- A Bayesian model for time-series count data with weekend effects and a lagged reporting process ☆10 · Mar 7, 2022 · Updated 3 years ago