jamesmishra / mysqldump-to-csv
A quickly-hacked-together Python script to turn mysqldump files to CSV files. Optimized for Wikipedia database dumps.
☆336 Updated 3 years ago
Alternatives and similar repositories for mysqldump-to-csv
Users that are interested in mysqldump-to-csv are comparing it to the libraries listed below
- Parses log lines from an apache log ☆258 Updated last year
- Language Detection with Infinity-gram ☆230 Updated 10 years ago
- Extract countries, regions and cities from a URL or text ☆217 Updated 5 years ago
- Converts JSON files to CSV (pulling data from nested structures). Useful for Mongo data ☆263 Updated 4 years ago
- Twitter text processing library (auto linking and extraction of usernames, lists and hashtags). ☆178 Updated last year
- "Stop worrying about Elasticsearch analyzers", my therapist says ☆154 Updated 4 years ago
- A Twitter search client mining tweets using their advanced search implementation. ☆90 Updated 7 years ago
- Python interface to the Stanford Named Entity Recognizer ☆294 Updated 4 years ago
- Automatically exported from code.google.com/p/chromium-compact-language-detector ☆161 Updated 5 years ago
- Tools to work with the big reddit JSON data dump. ☆256 Updated last year
- Index URLs in Common Crawl ☆198 Updated 8 years ago
- Send summary messages of your Luigi jobs to Slack ☆46 Updated 6 years ago
- A command-line tool for using CommonCrawl Index API at http://index.commoncrawl.org/ ☆205 Updated 7 years ago
- Sentiment Classification using Word Sense Disambiguation ☆170 Updated 3 years ago
- Export from an Elasticsearch into a CSV file ☆511 Updated 4 years ago
- Carrot2 plugin for ElasticSearch ☆294 Updated 3 years ago
- Text classification using Naive Bayes and Elasticsearch ☆152 Updated 9 years ago
- Randomly sample lines from a csv, tsv, or other line-based data file ☆125 Updated 10 years ago
- This is a mirror of the script by Giuseppe Attardi, and contains history before the official repo started: https://github.com/attardi/wik… ☆259 Updated 9 years ago
- Determine if a web comment is spam or not using naive Bayes. Trained on youtube comments. ☆92 Updated 13 years ago
- Trend detection algorithms for Twitter time series data ☆194 Updated 8 years ago
- A python tool for collecting tweets in mongoDB using the search API ☆80 Updated 2 years ago
- ☆98 Updated 4 years ago
- A python library for simple text summarization ☆219 Updated 10 years ago
- A tool to segment text based on frequencies and the Viterbi algorithm "#TheBoyWhoLived" => ['#', 'The', 'Boy', 'Who', 'Lived'] ☆81 Updated 9 years ago
- Elasticsearch Latent Semantic Indexing experimentation ☆33 Updated 6 years ago
- Refinery - A locally deployable open-source web platform for analysis of large document collections ☆101 Updated 9 years ago
- Git Support Utilities ☆81 Updated 3 years ago
- Sometimes sites make crawling hard. Selenium-crawler uses selenium automation to fix that. ☆126 Updated 12 years ago
- Public Machine Learning and Data Competition Repo ☆54 Updated 10 years ago