jamesmishra / mysqldump-to-csv
A quickly-hacked-together Python script to turn mysqldump files to CSV files. Optimized for Wikipedia database dumps.
☆332 Updated 2 years ago
Alternatives and similar repositories for mysqldump-to-csv
Users who are interested in mysqldump-to-csv are comparing it to the repositories listed below.
- Parses log lines from an apache log ☆258 Updated last year
- command line tool to convert json to csv ☆815 Updated 2 years ago
- A command-line tool for using CommonCrawl Index API at http://index.commoncrawl.org/ ☆197 Updated 6 years ago
- Automatically exported from code.google.com/p/chromium-compact-language-detector ☆162 Updated 4 years ago
- Index URLs in Common Crawl ☆194 Updated 7 years ago
- Converts JSON files to CSV (pulling data from nested structures). Useful for Mongo data ☆263 Updated 4 years ago
- Extract countries, regions and cities from a URL or text ☆217 Updated 4 years ago
- Command line tool for deduplicating CSV files ☆428 Updated 5 years ago
- Adaptive crawler which uses Reinforcement Learning methods ☆169 Updated 7 years ago
- Python script to load CSV to SQLite ☆249 Updated last year
- Automatically extracts and normalizes an online article or blog post publication date ☆117 Updated 2 years ago
- Randomly sample lines from a csv, tsv, or other line-based data file ☆125 Updated 10 years ago
- Load a CSV (or TSV) file into an Elasticsearch instance ☆62 Updated 2 years ago
- Carrot2 plugin for ElasticSearch ☆291 Updated 2 years ago
- Bulk indexing command line tool for elasticsearch. ☆280 Updated 5 months ago
- Python bindings to the Compact Language Detector ☆33 Updated 5 years ago
- A project to attempt to automatically login to a website given a single seed ☆126 Updated 3 years ago
- Language Detection with Infinity-gram ☆230 Updated 10 years ago
- Demonstration of using Python to process the Common Crawl dataset with the mrjob framework ☆167 Updated 3 years ago
- Analysis and visualization of email data ☆143 Updated 7 years ago
- Get gender from first name in python ☆165 Updated 7 years ago
- "Stop worrying about Elasticsearch analyzers", my therapist says ☆154 Updated 4 years ago
- Sentiment Classification using Word Sense Disambiguation ☆170 Updated 3 years ago
- ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (image… ☆96 Updated 7 years ago
- WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/fi… ☆48 Updated 3 years ago
- Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features. ☆108 Updated 4 months ago
- Python scripts for processing XML documents and converting to SQL, CSV, and JSON [UNMAINTAINED] ☆249 Updated 4 months ago
- Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city. ☆62 Updated 8 years ago
- Determine if a web comment is spam or not using naive Bayes. Trained on youtube comments. ☆92 Updated 13 years ago
- A Python library to detect and extract listing data from an HTML page. ☆108 Updated 8 years ago