An attempt at creating a gold standard dataset for backtesting yesterday & today's content-extractors
☆35Mar 19, 2015Updated 11 years ago
Alternatives and similar repositories for crawl-to-the-future
Users that are interested in crawl-to-the-future are comparing it to the libraries listed below.
- Analysis related to article on FOIA Online Database.☆11Feb 2, 2017Updated 9 years ago
- get facebook data☆10Sep 14, 2014Updated 11 years ago
- Investigative tool for extracting relevant areas from many documents☆14Nov 17, 2015Updated 10 years ago
- Fast structured perceptron sequential labeler☆15Dec 8, 2015Updated 10 years ago
- Migrating to https://github.com/origamitower/folktale☆20Sep 6, 2016Updated 9 years ago
- ☆16Jun 7, 2018Updated 7 years ago
- DEPRECATED: Use ghc-heap, ghc-heap-view in GHC 8.x instead.☆18Sep 17, 2016Updated 9 years ago
- Replication files for the March 2, 2015 Barron's story "The Little Guy Wins!," measuring market makers' trade execution quality.☆13Mar 12, 2015Updated 11 years ago
- Adds read support for Excel files (xls and xlsx) to agate.☆18Mar 27, 2026Updated last month
- LEMS interpreter implemented in Python☆12Nov 26, 2025Updated 5 months ago
- Extract data from websites using basic statistical magic☆506Oct 2, 2020Updated 5 years ago
- The missing datasets manager. Like Homebrew but for datasets. CLI tool to search and discover datasets!☆41May 29, 2017Updated 8 years ago
- Obtained in December 2014 through a Freedom of Information request☆15Jan 29, 2016Updated 10 years ago
- [UNMAINTAINED] A hypermedia REST HTTP API library for Clojure☆76Jul 12, 2015Updated 10 years ago
- Ensure that a stream disconnects if it goes over `maxBytes` `perSeconds`☆13Apr 27, 2020Updated 6 years ago
- Parse live video and extract Chyron text☆20Aug 17, 2017Updated 8 years ago
- MIDI Controller for Panoramical☆18Apr 7, 2014Updated 12 years ago
- Python's missing statistical Swiss Army knife☆15Aug 25, 2015Updated 10 years ago
- A distributed in-memory fabric based on shared-memory blocks and datashape. Any language can operate on the data.☆13Feb 12, 2016Updated 10 years ago
- pythonic processes☆12Jun 12, 2015Updated 10 years ago
- A how-to on mass collection of FEC data using the command line and regular expressions☆29Mar 18, 2016Updated 10 years ago
- Manage and load dataprotocols.org Data Packages☆27Sep 17, 2015Updated 10 years ago
- Tweets annotated with coarse-grained sense labels (supersenses)☆13Jun 13, 2014Updated 11 years ago
- Links parts of input text to Wikipedia articles☆16Sep 9, 2012Updated 13 years ago
- A clone of the windows snipping tool for imgur!☆12Apr 1, 2014Updated 12 years ago
- a relational algebra library for JavaScript☆60Apr 15, 2026Updated 3 weeks ago
- A Flask+Elasticsearch UI for exploring the DC Inbox dataset from http://web.stevens.edu/dcinbox/Home.html☆16Jan 21, 2022Updated 4 years ago
- A semantic web crawler☆20Sep 20, 2010Updated 15 years ago
- A skeleton Django project☆94Jan 21, 2022Updated 4 years ago
- System for mining Wikipedia Usage data to read our collective mind☆20Sep 28, 2014Updated 11 years ago
- Karma Framework for running performance tasks using Telemetry☆37May 29, 2019Updated 6 years ago
- Content-based Recommendation Generator☆13Jan 21, 2015Updated 11 years ago
- connect middleware that causes chaos☆26Feb 26, 2015Updated 11 years ago
- A dataset of popular pages (taken from <dir.yahoo.com>) with manually marked up semantic blocks.☆15Feb 9, 2014Updated 12 years ago
- Age classification from text using PAN16, blogs, Fisher Callhome, and Cancer Forum☆18Jul 1, 2022Updated 3 years ago
- mltk - Moz Language Tool Kit☆12Mar 6, 2015Updated 11 years ago
- Manage Python with Boxen and pyenv☆20Nov 9, 2017Updated 8 years ago
- Data science tools from Moz☆23Jan 11, 2017Updated 9 years ago
- Data for our analysis of Amtrak 188 derailment.☆10May 14, 2015Updated 10 years ago