will3216 / newspaper3k_lambda_template
Pre-built template for using newspaper3k on AWS Lambda
☆17Updated 2 years ago
Alternatives and similar repositories for newspaper3k_lambda_template
Users that are interested in newspaper3k_lambda_template are comparing it to the libraries listed below
- GraphiPy: Universal Social Data Extractor☆83Updated 2 years ago
- Simple dashboard for getting currently trending hashtags and topics on Twitter☆25Updated 2 years ago
- A helper library full of URL-related heuristics.☆70Updated last week
- CLI to extract article contents in bulk using Newspaper3k and multithreading.☆12Updated 7 years ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec.☆97Updated this week
- LinkRun - Data Engineering project done in 3 weeks during the Insight fellowship☆39Updated 5 years ago
- A Python client for the People Data Labs API☆35Updated 2 weeks ago
- Index Common Crawl archives in tabular format☆122Updated last month
- An automated, programming-free web scraper for interactive sites☆111Updated 2 years ago
- AI based web-wrapper for web-content-extraction☆100Updated 2 years ago
- 🏗️ Create APIs from CSV files within seconds, using fastapi☆77Updated 4 years ago
- The Summarlight Chrome Extension highlights the most important parts of posts/stories/articles.☆26Updated 6 years ago
- Apify actor for extracting data about homes from Zillow.com using its internal API.☆54Updated 3 years ago
- ETL of newspaper article keywords using Apache Airflow, Newspaper3k, Quilt T4 and AWS S3☆16Updated 5 months ago
- A simple Flask & React app to demonstrate how to generate text with OpenAI's GPT-2☆53Updated 2 years ago
- Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex.☆62Updated last week
- Airbnb Scraper: Advanced Airbnb Search using Scrapy☆204Updated 2 years ago
- This repository provides usage examples for the Python module Newspaper3k.☆148Updated last year
- Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-b…☆22Updated 5 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends☆57Updated last year
- Generate product descriptions, blogs, ads and more using GPT architecture with a single request to TextCortex API a.k.a Hemingwai☆40Updated 2 years ago
- The Official NewsCatcher News API V2 SDK for Python☆20Updated 11 months ago
- ☆71Updated last year
- Tools for running OCR against files stored in S3☆119Updated 3 years ago
- Add website scraping abilities to Datasette☆64Updated 2 years ago
- ☆62Updated last year
- Techniques for Scraping the Web in Python☆26Updated 7 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆41Updated 6 years ago
- Now included in rigour☆151Updated last week
- An open-source archive that gathers, saves, shares and analyzes news homepages☆144Updated last week