A Python package for cleaning Gutenberg books and datasets
☆35 · May 2, 2025 · Updated last year
Alternatives and similar repositories for gutenberg_cleaner
Users that are interested in gutenberg_cleaner are comparing it to the libraries listed below.
- Code used for the paper "Linguistic Features for Readability Assessment" (Deutsch, Jasbi, and Shieber 2020) ☆25 · Jul 19, 2021 · Updated 4 years ago
- POS tagging models for Hindi English Code Mixed Tweets ☆11 · Aug 1, 2018 · Updated 7 years ago
- Extracting Cultural Commonsense Knowledge at Scale (WWW 2023) ☆11 · Feb 15, 2024 · Updated 2 years ago
- A text readability reading list maintained by BLCU ICALL Research Group ☆13 · Mar 27, 2020 · Updated 6 years ago
- Json encoder and decoder for Common-Lisp ☆13 · Feb 4, 2024 · Updated 2 years ago
- Repo of the Turing's Humanities & Data Science Discussion Group ☆13 · Jul 21, 2022 · Updated 3 years ago
- This repository includes all the code and data for the paper ELiDi (End2end Entity Linking and Disambiguation) ☆14 · Jul 18, 2021 · Updated 4 years ago
- Code accompanying our paper at AISTATS 2020 ☆21 · Jan 12, 2021 · Updated 5 years ago
- This repository contains supplementary material for the book "Representation learning: propositionalization and embeddings" ☆19 · Nov 16, 2022 · Updated 3 years ago
- Repository for Vajjala & Lucic (2018) ☆69 · Feb 15, 2024 · Updated 2 years ago
- Introduction to AI for GLAM ☆20 · Feb 6, 2026 · Updated 3 months ago
- Python interface for LegiScan API ☆22 · Jan 18, 2015 · Updated 11 years ago
- Combination of the RapidFuzz library with Spacy PhraseMatcher ☆11 · Sep 29, 2021 · Updated 4 years ago
- Showcase notebooks for getML ☆19 · Jan 20, 2026 · Updated 3 months ago
- ☆14 · Mar 9, 2023 · Updated 3 years ago
- Code used to run experiments for the ICLR 2023 paper "Computational Language Acquisition with Theory of Mind". ☆15 · Apr 27, 2023 · Updated 3 years ago
- A collection of Tensorflow implementations of embeddings for entities. ☆10 · Apr 25, 2019 · Updated 7 years ago
- Surprisal calculation using HuggingFace LMs ("Frequency Explains the Inverse Correlation of Large Language Models’ Size, Training Data Am… ☆22 · Mar 7, 2024 · Updated 2 years ago
- R package PCAtest for evaluating the statistical significance of PCA analysis, selecting number of significant PC axes, and testing the c… ☆22 · Oct 18, 2024 · Updated last year
- ☆11 · Sep 27, 2024 · Updated last year
- Funny text generation using character level LSTM model, featured in TED ideas ☆24 · Sep 21, 2018 · Updated 7 years ago
- ☆13 · Feb 8, 2024 · Updated 2 years ago
- Feature Selection using Simulated Annealing ☆11 · Aug 10, 2022 · Updated 3 years ago
- A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal… ☆11 · Sep 16, 2024 · Updated last year
- ☆12 · Mar 8, 2024 · Updated 2 years ago
- MFTE (Multi Feature Tagger of English) Python is the Python version based on Le Foll's MFTE written in Perl. It is extended to include se… ☆30 · Feb 21, 2026 · Updated 2 months ago
- Code and data used for participation in SemEval-2018 Task 3: "Irony detection in English tweets" ☆17 · Mar 5, 2020 · Updated 6 years ago
- An Easy Annotation Tool for Natural Language Processing ☆11 · May 17, 2024 · Updated last year
- [ONGOING] ACM ICPC Handbook for Algorithms and Data Structures ☆24 · Oct 25, 2020 · Updated 5 years ago
- Command Line Interface for IA² models development, training and deployment. ☆10 · Jun 16, 2023 · Updated 2 years ago
- Freeling wrapper ☆12 · Jun 27, 2016 · Updated 9 years ago
- A python implementation of discrete optimal transport with a Tsallis entropy regularization. ☆14 · Oct 23, 2023 · Updated 2 years ago
- Template repository and README for submissions to Bellingcat's Global Hackathon ☆16 · Oct 7, 2022 · Updated 3 years ago
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format ☆33 · Jul 5, 2019 · Updated 6 years ago
- ☆26 · Sep 14, 2025 · Updated 7 months ago
- ln2sql as a python package ☆17 · Aug 20, 2019 · Updated 6 years ago
- ☆16 · Aug 6, 2023 · Updated 2 years ago
- Experimental Git Mirror of "https://sourceforge.net/p/lemur/galago" using "https://github.com/felipec/git-remote-hg" ☆13 · Dec 17, 2020 · Updated 5 years ago
- Visualization of topics in a document (documents), aimed to replace word cloud ☆19 · May 10, 2016 · Updated 9 years ago