smeeklai / chrome-selenium-scrapy
Ready-to-run Python 3.6.4 Docker image with Chrome, Selenium, and Scrapy installed
☆11 Updated 6 years ago
Related projects
Alternatives and complementary repositories for chrome-selenium-scrapy
- Brand disambiguator for tweets to differentiate e.g. Orange vs orange (brand vs foodstuff), using NLTK and scikit-learn ☆57 Updated 11 years ago
- Chatlytics is a data query and visualization platform for chat! ☆13 Updated 7 years ago
- A cookiecutter template for Apache Spark applications written in Scala ☆10 Updated 5 years ago
- A collection of datasets and databases ☆24 Updated 6 years ago
- Text similarity based on Word2Vec vectors. ☆10 Updated 7 years ago
- Load a linkedin network w/ python py2neo into a neo4j database, serve it via node.js, and display it w/ sigma.js ☆29 Updated 11 years ago
- This repo demonstrates how to load a sample Parquet formatted file from an AWS S3 Bucket. A python job will then be submitted to a Apach… ☆19 Updated 8 years ago
- How to do data science with Optimus, Spark and Python. ☆18 Updated 5 years ago
- Airflow code accompanying blog post. ☆21 Updated 5 years ago
- This repository contains the code of the assistant used to demonstrate the migration from DialogFlow to Rasa ☆11 Updated 5 years ago
- Example code for building your own MemSQL Streamliner Pipelines ☆23 Updated 7 years ago
- Simple FieldCache based query introspection Solr Search Component - solves the 'red sofa' problem ☆12 Updated 3 years ago
- gzipstream allows Python to process multi-part gzip files from a streaming source ☆23 Updated 7 years ago
- Cloud Spanner Connector for Apache Spark ☆17 Updated last month
- spark-emr ☆14 Updated 10 years ago
- VoltDB Click Stream Processing Example. ☆16 Updated 6 years ago
- Feet is a tool for extracting entities from a text according to dictionaries. ☆11 Updated 7 years ago
- Small Docker image with Python Machine Learning tools (~180MB) https://hub.docker.com/r/frolvlad/alpine-python-machinelearning/ ☆79 Updated 11 months ago
- A Pythonic API for Amazon's States Language for defining AWS Step Functions ☆8 Updated last year
- Tool for removing duplicate documents from Elasticsearch ☆54 Updated last year
- Telecom scenarios implemented with streaming techniques ☆11 Updated last year
- code and slides for my PyGotham 2016 talk, "Higher-level Natural Language Processing with textacy" ☆15 Updated 8 years ago
- Traptor -- A distributed Twitter feed ☆26 Updated 2 years ago
- Docker container to make running Luigi tasks real easy. ☆11 Updated 8 years ago
- Parse wikipedia dumps and index (some) page data to elasticsearch ☆49 Updated 9 years ago
- Small set of utilities to simplify writing Scrapy spiders. ☆49 Updated 9 years ago