An open-source Python package for cleaning raw text data
☆79 · Aug 8, 2023 · Updated 2 years ago
Alternatives and similar repositories for cleantext
Users who are interested in cleantext are comparing it to the libraries listed below.
- Python parser for the Archie Markup Language (ArchieML) ☆12 · Nov 7, 2021 · Updated 4 years ago
- Scripts for KGIRNet model for ESWC ☆10 · Jul 6, 2023 · Updated 2 years ago
- Jupyter Notebook Scientific Python Stack extension for Docker Desktop ☆18 · Mar 26, 2024 · Updated 2 years ago
- A small repo of notes and scripts for collecting data on U.S. deadly force police incidents ☆10 · Aug 9, 2015 · Updated 10 years ago
- This repository includes all the code and data for the paper ELiDi (End2end Entity Linking and Disambiguation) ☆14 · Jul 18, 2021 · Updated 4 years ago
- 🧹 Python package for text cleaning ☆1,020 · May 15, 2026 · Updated 3 weeks ago
- US election metadata, packaged as python! ☆10 · Mar 16, 2022 · Updated 4 years ago
- ☆16 · Nov 5, 2018 · Updated 7 years ago
- Dynamic Topic Modeling and Topic Chains of Reuters News Articles using SCVB0 ☆24 · Jan 12, 2017 · Updated 9 years ago
- All things involving lavaan ☆14 · Mar 13, 2013 · Updated 13 years ago
- All the goto functions you need to handle NLP use-cases, integrated in NLPretext ☆143 · Mar 24, 2025 · Updated last year
- CommonsenseQA ☆10 · Mar 28, 2020 · Updated 6 years ago
- ☆12 · Jun 3, 2021 · Updated 5 years ago
- finds a different set of words that sound like the input ☆10 · Feb 24, 2022 · Updated 4 years ago
- A lightweight Python script that fetches data from a Google spreadsheet, transforms to JSON, then optionally commits a data file to a Git… ☆10 · Apr 1, 2026 · Updated 2 months ago
- PyTorch code for NAACL 2022 paper: DialoKG: Knowledge-Structure Aware Task-Oriented Dialogue Generation (https://aclanthology.org/2022.fi… ☆16 · Apr 21, 2026 · Updated last month
- Sharing a viewer we built for WNYC. ☆12 · May 10, 2011 · Updated 15 years ago
- A Los Angeles Times analysis of water usage after the state eased drought restrictions ☆12 · Mar 19, 2021 · Updated 5 years ago
- yeoman generator for newsapps. ☆15 · Jun 3, 2015 · Updated 11 years ago
- Jupyter notebooks - A tool to write and share executable notebooks and data visualization ☆10 · Feb 5, 2026 · Updated 4 months ago
- A demo project and template repository showing how I use SpatiaLite with Datasette for quick spatial analysis. ☆17 · Jul 7, 2024 · Updated last year
- Data and scripts for examining the Department of Defense's 1033 excess equipment program ☆16 · Jun 21, 2022 · Updated 3 years ago
- Allows users to instantly reveal who donated to any current lawmakers ☆10 · Jun 18, 2015 · Updated 10 years ago
- ☆14 · Jul 18, 2024 · Updated last year
- Notes and activity code for the "Python 3: Data cleaning and visualization with pandas and matplotlib" session at the 2018 NICAR conferen… ☆11 · Jun 1, 2021 · Updated 5 years ago
- This library is for displaying the XAML code of theme libraries for WPF (e.g. MaterialDesignInXamlToolkit) ☆12 · Sep 6, 2017 · Updated 8 years ago
- Visualising Sydney bus congestion with Marey charts ☆13 · Nov 23, 2022 · Updated 3 years ago
- A Python module to convert natural language numerics into ints and floats. ☆233 · Sep 26, 2024 · Updated last year
- Code for extracting data from a large number of PDFs, particularly FCC political ad documents ☆15 · Oct 26, 2017 · Updated 8 years ago
- Referring expression comprehension on ReferIt(RefClef) ☆10 · Nov 28, 2016 · Updated 9 years ago
- Create styles and themes for your Python desktop applications ☆16 · Mar 21, 2022 · Updated 4 years ago
- Create Hilbert curves in ggplot2 ☆15 · Dec 23, 2025 · Updated 5 months ago
- ☆11 · May 31, 2024 · Updated 2 years ago
- spaCy-to-naf converter ☆21 · Jun 10, 2025 · Updated last year
- Matplotlib Image labeller for classifying images ☆11 · Apr 6, 2026 · Updated 2 months ago
- A Flexible Deep Learning Approach to Fuzzy String Matching ☆151 · Oct 16, 2024 · Updated last year
- Demo files for the Power BI Dev camp session on developing Azure Functions for PowerBI ☆10 · Jun 30, 2022 · Updated 3 years ago
- An exploratory visualization tool for the analysis of movements between geographic locations ☆13 · Dec 9, 2022 · Updated 3 years ago
- QGIS Plugin for Cesium ion ☆17 · Jul 14, 2025 · Updated 10 months ago