pmandera / duometer
Near-duplicate detection tool
☆24Updated 8 years ago
Alternatives and similar repositories for duometer
Users who are interested in duometer are comparing it to the libraries listed below.
- Elwha is a Java application for monitoring topics, sentiment and events on Twitter streams with the ability to generate notification mess…☆16Updated 10 years ago
- A platform for collecting, analyzing, and visualizing social media data.☆12Updated 4 years ago
- Topic modeling web application☆40Updated 10 years ago
- stav text annotation visualiser☆34Updated 14 years ago
- Vizlinc☆15Updated 9 years ago
- Tools for tracking stories on news homepages☆48Updated 6 years ago
- ☆14Updated 4 years ago
- ☆48Updated 11 years ago
- General Architecture for Text Engineering☆49Updated 9 years ago
- Scraper built with Scrapy.☆18Updated last year
- Raw Wikipedia counts for entity linking☆19Updated 8 years ago
- A pipeline for crawling of RSS feeds and the associated content. Demo at newsfeed.ijs.si.☆21Updated 12 years ago
- Easily identify and label sentence intervals using various taggers.☆16Updated 8 years ago
- An OpenCalais API Interface for Python.☆20Updated 13 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆46Updated 7 years ago
- Quickly analyze and explore email with advanced analytics and visualization.☆56Updated 4 years ago
- Literate data analysis with iPython notebooks and Jekyll.☆92Updated 11 years ago
- A queue-controlled browser automation tool for improving web crawl quality☆63Updated 2 months ago
- Parser for KAF NAF files written in Python☆16Updated 4 years ago
- JavaScript based graph visualization library with emphasis on customization and modularity.☆13Updated 6 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED]☆25Updated 9 years ago
- Schemas to convert common fixed-width file formats into CSV using in2csv.☆125Updated 4 years ago
- extensible Web Retrieval Toolkit☆17Updated 3 years ago
- Slinky, a high-performance web crawler / text analytics in Python, Redis, Hadoop, R, Gephi☆41Updated 15 years ago
- Automated NLP sentiment predictions: batteries included, or use your own data☆18Updated 7 years ago
- Command line tool to convert spreadsheets to databases, made for the UK's Office for National Statistics.☆80Updated last year
- Navigating around a grid of cells like XPath for spreadsheets; supports Python 3.5+☆48Updated 2 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io☆38Updated 10 years ago
- Python library and command line tool for converting data from one format to another☆99Updated 5 years ago
- A set of distinct value estimators that give probabilistic bounds on a set's cardinality☆22Updated 5 years ago