pmyteh/RISJbot

pmyteh / RISJbot

A scrapy project to extract the text and metadata of articles from news websites

☆74

Alternatives and similar repositories for RISJbot

Users that are interested in RISJbot are comparing it to the libraries listed below.

arthurk / scrapy-german-news
Scrapy project with spiders to extract article content from various German news sites
☆21 · Sep 13, 2013 · Updated 12 years ago
scrapy-plugins / scrapy-dotpersistence
A scrapy extension to sync `.scrapy` folder to an S3 bucket
☆18 · Mar 28, 2022 · Updated 4 years ago
python-ruia / awesome-ruia
A list of awesome projects for Ruia
☆13 · Aug 24, 2022 · Updated 3 years ago
DrorWalt / ANTMN
Supplementary code for "News Frame Analysis: An Inductive Mixed-method Computational Approach" http://dx.doi.org/10.1080/19312458.2019.16…
☆16 · Nov 13, 2020 · Updated 5 years ago
stummjr / HackerNewsDailyDigest
A toy project with Scrapy + Django + Celery to run on Heroku
☆13 · Sep 8, 2015 · Updated 10 years ago
gonenhila / usage_change
Code for the paper "Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora", ACL 2020.
☆18 · Aug 28, 2020 · Updated 5 years ago
jroakes / NodeRank
Content Extraction using the PageRank algorithm to find the element containing the best content.
☆13 · Aug 14, 2019 · Updated 6 years ago
tommymarshall / hackernews
Hackernews clone built with Backbone and Laravel
☆11 · Apr 3, 2014 · Updated 12 years ago
azavea / docker-django
Base Docker image for Django and Gunicorn.
☆28 · May 9, 2023 · Updated 3 years ago
orangain / scrapy-s3pipeline
Scrapy pipeline to store chunked items into Amazon S3 or Google Cloud Storage bucket.
☆76 · Mar 18, 2022 · Updated 4 years ago
Lexy0309 / Bustabit
This is a cryptocurrency gambling game project built in React JS and working on butstabit.com
☆14 · Sep 4, 2020 · Updated 5 years ago
mediacloud / api-tutorial-notebooks
A set of jupyter notebooks demonstrating how to use the Media Cloud API.
☆47 · Jun 17, 2025 · Updated last year
frontyard / cordova-plugin-android-tv
Cordova Android TV Plugin
☆19 · Jul 30, 2021 · Updated 4 years ago
cdrx / scrapyd-authenticated
Docker container running scrapyd with HTTP authentication
☆41 · May 14, 2024 · Updated 2 years ago
Lyquix / ubuntu-lamp
Bash scripts to automatically set up a LAMP server following best practices
☆16 · Jul 8, 2026 · Updated 3 weeks ago
BedrockStreaming / roboxt
DEPRECATED - simple parser for robots.txt
☆17 · Sep 16, 2019 · Updated 6 years ago
fili / screaming-frog-on-google-compute-engine
Screaming Frog SEO Spider Install Script by Fili (SEO Expert & ex-Google engineer)
☆14 · Apr 12, 2021 · Updated 5 years ago
skyrocknroll / python-kafka-avro-example
☆11 · Apr 9, 2017 · Updated 9 years ago
fboender / my_indexr
A tool that outputs SQL commands for dropping and recreating indexes on MySQL databases / tables.
☆12 · Aug 10, 2016 · Updated 9 years ago
schmokel / FBAdLibrarian
The FBAdLibrarian is a simple tool that pulls ad data and collects images offered by Facebook’s Ad Library API.
☆15 · Mar 10, 2023 · Updated 3 years ago
themeskult / urban-theme
☆15 · May 4, 2014 · Updated 12 years ago
ibm-cloud-security / certificate-manager-domain-validation-cloud-function-sample
☆11 · Jan 10, 2022 · Updated 4 years ago
MichaelKreil / twitter-analysis2
☆12 · Apr 12, 2023 · Updated 3 years ago
gitdagray / tinypng_clone
☆10 · Apr 10, 2021 · Updated 5 years ago
opener-project / coreference-base
Co-reference resolution for the English language.
☆18 · Jan 12, 2015 · Updated 11 years ago
ndg63276 / alexa-googlemaps
An Alexa skill to give directions from Google Maps
☆11 · Apr 2, 2021 · Updated 5 years ago
hplgit / vagrantbox
Tutorial for Vagrant boxes and related technologies such as virtualenv.
☆12 · Sep 18, 2016 · Updated 9 years ago
JBGruber / paperboy
A comprehensive (eventually) collection of webscraping scripts for news media sites
☆76 · Jul 2, 2026 · Updated 3 weeks ago
gdgbhu / gmail-php-starter
GMAIL PHP Starter Project built with Twitter Bootstrap 3
☆17 · May 3, 2015 · Updated 11 years ago
KlaraKrieg / GrepBiasIR
Information Retrieval Gender Bias Dataset
☆14 · Apr 21, 2023 · Updated 3 years ago
cschwem2er / stminsights
A Shiny Application for Inspecting Structural Topic Models
☆121 · Jun 27, 2024 · Updated 2 years ago
vanatteveldt / rsyntax
R library to help deal with syntactic structure
☆38 · Feb 3, 2025 · Updated last year
KayneWest / DeepSpeech
Project trying to replicate http://arxiv.org/pdf/1412.5567v2.pdf
☆12 · Mar 22, 2015 · Updated 11 years ago
hemanth09 / creating-crud-app-with-apollo-graphql-node-mongodb-and-react
Creating a simple CRUD app with NodeJS, MongoDB, GraphQL, React and Apollo
☆10 · Feb 1, 2020 · Updated 6 years ago
Joaoffg / ELM
The Erasmian Language Model
☆14 · Jun 2, 2026 · Updated last month
johnbe4 / getSeoSitemap
PHP library to get the sitemap. It crawls a whole website, checking all internal and external links plus Search Engine Optimization.
☆15 · Aug 29, 2024 · Updated last year
scrapinghub / scrapinghub-entrypoint-scrapy
Scrapy entrypoint for Scrapinghub job runner
☆24 · Feb 26, 2026 · Updated 5 months ago
jaydio / routeros-scripts
MikroTik RouterOS - Assorted Scripts
☆25 · Aug 10, 2022 · Updated 3 years ago
rhamerly / webmapper
A Chrome extension that creates a personalized map of the web based on the user's browsing history.
☆26 · Mar 10, 2013 · Updated 13 years ago