Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼
☆22 · Jan 29, 2026 · Updated 3 months ago
Alternatives and similar repositories for dss-plugin-nlp-preparation
Users that are interested in dss-plugin-nlp-preparation are comparing it to the libraries listed below.
- A python library to generate highly realistic typos (fuzz-testing) ☆13 · Mar 16, 2025 · Updated last year
- Automatically extracts information from pictures of receipts. ☆10 · Mar 22, 2019 · Updated 7 years ago
- Stroke-based Character Reconstruction ---> https://arxiv.org/abs/1806.08990 ☆15 · Dec 6, 2021 · Updated 4 years ago
- Self-collected data for Masked Face recognition paper (300+ different participants) ☆12 · Jul 13, 2023 · Updated 2 years ago
- Vietnamese spelling correction (ViSC) tool ☆12 · Dec 11, 2016 · Updated 9 years ago
- ☆17 · Jul 10, 2022 · Updated 3 years ago
- Add accent for Vietnamese. N-Grams + Beam search, LSTM, Transformer, Evolved Transformer ☆18 · Feb 3, 2021 · Updated 5 years ago
- ☆20 · Nov 4, 2022 · Updated 3 years ago
- fastlangid, the only language identification package that support cantonese (zh-yue), simplified (zh-hans) and traditional chinese (zh-ha… ☆43 · Dec 6, 2022 · Updated 3 years ago
- Download images including URLs from Google, Bing, Flickr and Instagram hashtags with given keyword ☆24 · May 14, 2022 · Updated 4 years ago
- ☆12 · Jun 3, 2021 · Updated 4 years ago
- BERT models pretrained on the CORD-19 Kaggle dataset ☆15 · Jun 8, 2020 · Updated 5 years ago
- This repo contains my works on the area of NLP, such as Neural Machine Translation, Named Entity Recognition etc. ☆13 · Sep 19, 2020 · Updated 5 years ago
- Omnipy is a high level Python library for type-driven data wrangling and scalable workflow orchestration (under development) ☆26 · May 19, 2026 · Updated last week
- Vietnamese ID information detection ☆19 · Jun 24, 2022 · Updated 3 years ago
- Bash script to create an ebook from a list of web articles. Inspired by the now-defunct Readlists.org by Readability ☆18 · Oct 13, 2019 · Updated 6 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ☆11 · Feb 6, 2024 · Updated 2 years ago
- My OpenCode and Oh-My-OpenCode configuration files with API proxy setup documentation ☆36 · Jan 5, 2026 · Updated 4 months ago
- Command line tool and async library to perform basic file operations on local paths, Google Cloud Storage paths and Azure Blob Storage pa… ☆39 · Apr 7, 2026 · Updated last month
- Website for the KGC 2020 Tutorial: "Building a Knowledge Graph from schema.org annotations" ☆10 · Jun 26, 2020 · Updated 5 years ago
- Simple web code editor built with web components libraries ☆15 · Oct 12, 2023 · Updated 2 years ago
- Code and Word2Vec embeddings of LOINC codes for KDD 2019 DSHealth paper "Evaluation of Embeddings of Laboratory Test Codes for Patients a… ☆11 · Jun 13, 2024 · Updated last year
- Remark plugin for selecting and storing code blocks from markdown. ☆18 · Dec 7, 2022 · Updated 3 years ago
- Automatically exported from code.google.com/p/hunpos ☆12 · Apr 9, 2018 · Updated 8 years ago
- ☆10 · Oct 15, 2020 · Updated 5 years ago
- The best Python package for comparing two dataframes ☆12 · Dec 29, 2021 · Updated 4 years ago
- LaTeXDataHub is an open-source platform dedicated to the sharing and contribution of real-world LaTeX image datasets and their annotation… ☆12 · Aug 13, 2024 · Updated last year
- ☆11 · Apr 8, 2022 · Updated 4 years ago
- Super simple, zero config options, <2kb declarative tooltip library with no dependencies. ☆17 · Jun 2, 2023 · Updated 2 years ago
- Text Augmentation for Machine Learning tasks. Small data: How to grow your text dataset for classification? ☆22 · Jan 18, 2019 · Updated 7 years ago
- A Bio2BEL package for DrugBank (https://www.drugbank.ca) ☆10 · Dec 14, 2020 · Updated 5 years ago
- A WordPress plugin to receive movie/series information, including poster and trailer from IMDB. ☆10 · May 21, 2017 · Updated 9 years ago
- Force Users to upload profile photo before they can use the site. ☆10 · Dec 17, 2017 · Updated 8 years ago
- Design your Material-UI buttons, add clickable hyperlinks, integrate them in your Streamlit apps! ☆10 · Jun 17, 2022 · Updated 3 years ago
- Library and examples to interface a HPGL plotter such as HP7550a to processing. ☆10 · Jan 15, 2015 · Updated 11 years ago
- ☆12 · Oct 24, 2025 · Updated 7 months ago
- Safely running potentially non-terminating functions in Elm. ☆10 · Apr 20, 2021 · Updated 5 years ago
- Neural Sentiment Analyzer for Modern Hebrew ☆21 · Nov 21, 2022 · Updated 3 years ago
- Python utility to extract differences between two pandas dataframes. ☆11 · Apr 4, 2026 · Updated last month