VIDA-NYU / domain_discovery_tool_deprecated
Seed acquisition tool to bootstrap focused crawlers
☆23 Updated 7 years ago
Alternatives and similar repositories for domain_discovery_tool_deprecated:
Users interested in domain_discovery_tool_deprecated are comparing it to the repositories listed below.
- Topic modeling web application ☆40 Updated 9 years ago
- [UNMAINTAINED] Firefox addon for Scrapely ☆5 Updated 9 years ago
- ☆43 Updated 9 years ago
- Viewers for statistics and dashboarding of Domain Search Engine data ☆122 Updated 9 years ago
- MITIE: library and tools for information extraction ☆29 Updated 10 years ago
- [UNMAINTAINED] Deploy, run and monitor your Scrapy spiders. ☆11 Updated 9 years ago
- General Architecture for Text Engineering ☆46 Updated 8 years ago
- Browser add-on and web server to support collection and analysis of web browsing data. ☆13 Updated 8 years ago
- A Topic Modeling toolbox ☆92 Updated 8 years ago
- An Exploration into Graph Databases ☆28 Updated 9 years ago
- Faceted search engine for domain-specific exploration of the Web ☆45 Updated 7 years ago
- Facet Search interface for MEMEX. ☆13 Updated 9 years ago
- Tools for scraping Twitter data, conversion, text analysis and graph construction ☆11 Updated 8 years ago
- A simple command line interface to the datamade/dedupe library. ☆42 Updated 2 years ago
- Demo code for learning_text_transformer ☆25 Updated 9 years ago
- Aperture-Tiles uses familiar web-based map interactions to allow exploration of arbitrary huge data sets. ☆74 Updated last year
- See https://github.com/tworavens/tworavens for the current repository for this project and http://2ra.vn for project pages. ☆30 Updated 6 years ago
- mltk - Moz Language Tool Kit ☆12 Updated 9 years ago
- For interacting with Nutch via Python ☆24 Updated 2 weeks ago
- Scraper built with Scrapy. ☆14 Updated 5 months ago
- Tribe extracts a network from an email mbox and writes it to a graphml file for visualization and analysis. ☆78 Updated last year
- Temporal Anomaly Detector (TAD) ☆15 Updated 7 years ago
- Multidimensional data explorer and visualization tool. ☆55 Updated 7 years ago
- ArchiveKit manages data and documents during ETL processes, either on a local file system or on S3. ☆15 Updated 9 years ago
- Common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text ☆34 Updated 8 years ago
- A network graph exploration tool ☆63 Updated 2 years ago
- ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (image… ☆94 Updated 6 years ago
- JavaScript based graph visualization library with emphasis on customization and modularity. ☆13 Updated 5 years ago
- Set of scripts to aid in the download of the GDELT data files from www.gdeltproject.org ☆11 Updated 10 years ago
- Data Server for Topic Models ☆121 Updated last year