commonsearch / gumbocy
Python binding for gumbo-parser using Cython
☆14 · Updated 8 years ago
Related projects:
- mltk - Moz Language Tool Kit ☆12 · Updated 9 years ago
- Search 'from' and 'to' strings to learn a text cleaning mapping ☆17 · Updated 9 years ago
- Modularly extensible semantic metadata validator ☆83 · Updated 8 years ago
- Extract, parse and populate templates from strings ☆27 · Updated 5 years ago
- Manage and load dataprotocols.org Data Packages ☆27 · Updated 9 years ago
- Demo code for learning_text_transformer ☆25 · Updated 9 years ago
- Find which links on a web page are pagination links ☆29 · Updated 7 years ago
- a set of services that provide NLP facilities ☆25 · Updated 3 years ago
- A repository for the "Combining DBpedia and Topic Modeling" GSoC 2016 idea ☆13 · Updated 8 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆25 · Updated this week
- common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text ☆34 · Updated 7 years ago
- An index data structure for approximate string search. ☆23 · Updated 5 years ago
- Traptor -- A distributed Twitter feed ☆26 · Updated 2 years ago
- SMART-Learner is a machine learning library built with researchers in mind. ☆10 · Updated 8 years ago
- A python implementation of DEPTA ☆83 · Updated 7 years ago
- code and slides for my PyGotham 2016 talk, "Higher-level Natural Language Processing with textacy" ☆15 · Updated 8 years ago
- High Level Kafka Scanner ☆19 · Updated 6 years ago
- ArchiveKit manages data and documents during ETL processes, either on a local file system or on S3. ☆15 · Updated 9 years ago
- Updates to Zope's keyphrase extractor (forked from 1.1.0) ☆67 · Updated 7 years ago
- A component that tries to avoid downloading duplicate content ☆27 · Updated 6 years ago
- Multidimensional data explorer and visualization tool. ☆52 · Updated 7 years ago
- python abstraction for key-value databases with key-prefix scans that supports Accumulo, HBase, Postgres, Mysql, Redis, and more. Can sw… ☆8 · Updated 9 years ago
- Algorithms for "schema matching" ☆25 · Updated 8 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 · Updated last year
- Entity Linking for the masses ☆56 · Updated 8 years ago
- A disk-based key/value store in Python with no dependencies. ☆21 · Updated 9 years ago