readikus / ramekin
An open source, real time trend detection library
☆9 Updated 5 years ago
Alternatives and similar repositories for ramekin
Users who are interested in ramekin are comparing it to the libraries listed below.
- GraphiPy: Universal Social Data Extractor ☆83 Updated 2 years ago
- Zyte Automatic Extraction integration for Scrapy ☆56 Updated 3 years ago
- Console program to get global ranking for a given website or domain ☆21 Updated 3 weeks ago
- Python library for feature selection for text features. It has filter method, genetic algorithm and TextFeatureSelectionEnsemble for impr… ☆52 Updated last year
- Virtual patent marking crawler at iproduct.epfl.ch ☆14 Updated 7 years ago
- Aviation grade news article metadata extraction ☆36 Updated 2 years ago
- Now included in rigour ☆151 Updated last month
- JedAI-WebApp is a GUI that facilitates the execution of JedAI. JedAI is an open source, high scalability toolkit that offers out-of-the-b… ☆23 Updated 2 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆57 Updated last year
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- Paginating the web ☆37 Updated 11 years ago
- Record Linkage ToolKit (Find and link entities) ☆110 Updated last year
- Matches a category of Google's Taxonomy to product that is described in any kind of text data ☆62 Updated 6 years ago
- NSS Capstone project to use natural language modeling, classification, and information extraction to get the exact employee count values … ☆15 Updated 6 years ago
- Formasaurus tells you the type of an HTML form and its fields using machine learning ☆118 Updated last year
- 🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ☆141 Updated last year
- advertools crawler UI ☆28 Updated 2 years ago
- Vector AI — A platform for building vector based applications. Encode, query and analyse data using vectors. ☆315 Updated last year
- Site Hound (previously THH) is a Domain Discovery Tool ☆23 Updated 4 years ago
- Automated Outlier Detection and Treatment Tool ☆102 Updated 2 years ago
- Extract social media links and account names from websites. ☆38 Updated 5 years ago
- General Architecture for Text Engineering ☆50 Updated 9 years ago
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders… ☆45 Updated 3 years ago
- Scalable String Similarity Joins in Python ☆39 Updated 11 months ago
- API - extract a list of keywords from a text. ☆18 Updated 7 years ago
- T2K Match is a matching algorithm optimised to match millions of web tables to a central knowledge base. ☆21 Updated 7 years ago
- AI based web-wrapper for web-content-extraction ☆100 Updated 2 years ago
- Download DIG to run on your laptop or server. ☆101 Updated 6 years ago
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆191 Updated 3 years ago
- Didactic Web crawler for Web Search Engines (CS 6913) course at NYU ☆11 Updated 2 years ago