scalingexcellence / scrapy-solr
A Scrapy pipeline that stores Scrapy items in a Solr server.
☆19 Updated 9 years ago
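To illustrate what such a pipeline might look like, here is a minimal sketch. It is not the scrapy-solr project's actual code: the class name `SolrPipelineSketch`, its parameters, and the injectable `send` hook are all hypothetical. It assumes a Solr core that accepts a JSON array of documents on its `/update` handler, and it follows the usual Scrapy pipeline shape (`process_item` and `close_spider` methods).

```python
import json
from urllib import request


class SolrPipelineSketch:
    """Hypothetical pipeline: buffers items and posts them to Solr in batches."""

    def __init__(self, solr_url, batch_size=100, send=None):
        # solr_url is assumed to point at a core, e.g. http://localhost:8983/solr/mycore
        self.update_url = solr_url.rstrip("/") + "/update?commit=true"
        self.batch_size = batch_size
        self.buffer = []
        # `send` is an illustrative hook so the HTTP call can be replaced in tests.
        self._send = send or self._http_post

    def _http_post(self, url, payload):
        # POST the JSON payload to Solr's update handler.
        req = request.Request(
            url,
            data=payload,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with request.urlopen(req) as resp:
            return resp.status

    def process_item(self, item, spider):
        # Called by Scrapy for every scraped item; the item must be returned.
        self.buffer.append(dict(item))
        if len(self.buffer) >= self.batch_size:
            self.flush()
        return item

    def close_spider(self, spider):
        # Send whatever is left when the crawl ends.
        self.flush()

    def flush(self):
        if not self.buffer:
            return
        payload = json.dumps(self.buffer).encode("utf-8")
        self._send(self.update_url, payload)
        self.buffer = []
```

In a real project a class like this would be enabled through the `ITEM_PIPELINES` setting. Batching is a design choice made here to reduce the number of HTTP requests; committing on every request, as this sketch does, is simple but may be slow on large crawls.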
Alternatives and similar repositories for scrapy-solr
Users who are interested in scrapy-solr compare it to the libraries listed below.
- Pure Python script that takes a user query and summarizes news related to it. ☆25 Updated 2 years ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) ontologies and SKOS thesauri ☆47 Updated 3 years ago
- A REST server endpoint built using Flask and Python. ☆24 Updated 2 years ago
- Site Hound (previously THH) is a domain discovery tool ☆23 Updated 4 years ago
- Small set of utilities to simplify writing Scrapy spiders. ☆49 Updated 9 years ago
- An online sentiment analyzer built with Flask and TextBlob ☆15 Updated 11 years ago
- Library designed to replace the SQLite backend with a MongoDB backend for Scrapy queue management ☆17 Updated 7 years ago
- Search engine base (crawler, indexer and parser) using Python, Celery, RabbitMQ, CouchDB and Whoosh. ☆11 Updated 2 weeks ago
- Find which links on a web page are pagination links ☆29 Updated 8 years ago
- Extract the difference between two HTML pages ☆32 Updated 7 years ago
- Scraper built with Scrapy. ☆18 Updated 10 months ago
- Virtual patent marking crawler at iproduct.epfl.ch ☆14 Updated 7 years ago
- A Scrapy pipeline that sends items to an Elasticsearch server ☆98 Updated 7 years ago
- ☆43 Updated 9 years ago
- Keyword query search engine on semantic store/linked data web ☆9 Updated 9 years ago
- Small demo for a "search-as-you-type" app in AngularJS + Python/Flask + Elasticsearch ☆69 Updated 7 years ago
- Python package to detect and return RSS / Atom feeds for a given website. The tool supports major blogging platforms including WordPress, … ☆21 Updated 3 years ago
- A semantic analysis tool to generate synonym.txt files for Solr. [RETIRED] ☆24 Updated 8 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- Demo of the Newspaper article extraction library. ☆29 Updated 10 years ago
- Entity linker for the newspaper collection of the National Library of the Netherlands. Links named entity mentions to DBpedia description… ☆11 Updated 2 years ago
- Common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text ☆35 Updated 8 years ago
- A component that tries to avoid downloading duplicate content ☆27 Updated 7 years ago
- Extensions for using Scrapy on Amazon AWS ☆32 Updated 12 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆57 Updated last year
- Paginating the web ☆37 Updated 11 years ago
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a… ☆99 Updated 2 years ago
- Extract synonyms and keywords from sentences using a modified implementation of the Aho-Corasick algorithm ☆40 Updated 7 years ago
- Plots various graphs for a series of plaintext files in a directory ☆19 Updated 9 years ago
- API: extract a list of keywords from a text. ☆18 Updated 7 years ago