readikus / ramekin
An open-source, real-time trend detection library
☆9Updated 4 years ago
Related projects:
- Text analysis for automatic bookmarking/keyword extraction☆18Updated 7 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends☆57Updated 7 months ago
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders…☆46Updated 2 years ago
- A search engine for Open Data☆52Updated last year
- Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation.☆142Updated 7 months ago
- Virtual patent marking crawler at iproduct.epfl.ch☆14Updated 7 years ago
- Processes data from images which are tagged with the specified Instagram tag.☆13Updated 10 years ago
- A toolkit for clustering web pages based on various similarity measures.☆32Updated 2 years ago
- Python clients for Zyte AutoExtract API☆39Updated 2 years ago
- A Python library to detect and extract listing data from an HTML page.☆109Updated 7 years ago
- A generic crawler☆78Updated 6 years ago
- GraphiPy: Universal Social Data Extractor☆79Updated last year
- Matrix-based News Aggregation to Explore Media Bias☆19Updated 6 years ago
- Adaptive crawler which uses Reinforcement Learning methods☆170Updated 6 years ago
- Extract social media links and account names from websites.☆36Updated 4 years ago
- Matches a category of Google's Taxonomy to a product that is described in any kind of text data☆57Updated 6 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy.☆42Updated 5 years ago
- Zyte Automatic Extraction integration for Scrapy☆55Updated 2 years ago
- Parsing resumes in PDF format from LinkedIn☆65Updated 7 years ago
- LinkRun - Data Engineering project done in 3 weeks during the Insight fellowship☆37Updated 4 years ago
- Spin up Tor containers and then proxy HTTP requests via these Tor instances☆42Updated 3 years ago
- Analyze scraped data☆47Updated 4 years ago
- Console program to get global ranking for a given website or domain☆20Updated last year
- CoCrawler is a versatile web crawler built using modern tools and concurrency.☆183Updated 2 years ago
- Extracting addresses from text☆41Updated 6 years ago
- API - extract a list of keywords from a text.☆18Updated 7 years ago
- Source real estate prices from the Common Crawl.☆27Updated 5 years ago
- Pre-built Scrapy spiders for AutoExtract☆19Updated 4 months ago
- Record Linkage ToolKit (Find and link entities)☆105Updated last year
- A free, client-side web scraper that turns websites into structured data without having to use code.☆46Updated 8 years ago