mvallebr / CSVInputFormat
Input format for Hadoop that can read multiline CSVs
☆34 Updated last year
Alternatives and similar repositories for CSVInputFormat
Users who are interested in CSVInputFormat compare it to the libraries listed below.
- Code to index Hive tables to Solr and Solr indexes to Hive ☆46 Updated 6 years ago
- Spark RDD with Lucene's query and entity linkage capabilities ☆128 Updated 5 months ago
- Kite SDK Examples ☆99 Updated 4 years ago
- Hive SerDe for CSV ☆140 Updated 4 years ago
- ☆68 Updated 9 years ago
- A library you can include in your Spark job to validate the counters and perform operations on success. Goal is Scala/Java/Python support… ☆108 Updated 8 years ago
- Command-line tool for Apache Lucene ☆164 Updated 3 weeks ago
- ⛔️ [DEPRECATED] sbt's Scala incremental compiler ☆303 Updated 8 years ago
- Secondary sort and streaming reduce for Apache Spark ☆78 Updated 2 years ago
- Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ. ☆445 Updated 5 months ago
- functionstest ☆33 Updated 9 years ago
- A super simple utility for testing Apache Hive scripts locally for non-Java developers. ☆73 Updated 9 years ago
- Example Spark project using Parquet as a columnar store with Thrift objects. ☆48 Updated 11 years ago
- ☆92 Updated 8 years ago
- Scripts for parsing / making sense of YARN logs ☆52 Updated 9 years ago
- An efficient updatable key-value store for Apache Spark ☆254 Updated 8 years ago
- Programming MapReduce with Scalding ☆82 Updated 10 years ago
- Use Cascading Taps and Scalding DSL with Spark ☆49 Updated 9 years ago
- Mirror of Apache Blur ☆33 Updated 7 years ago
- Mirror of Apache Lens ☆62 Updated 6 years ago
- REST job server for Spark. Note that this is *not* the mainline open source version. For that, go to https://github.com/spark-jobserver… ☆345 Updated 8 years ago
- Library for organizing batch processing pipelines in Apache Spark ☆42 Updated 9 years ago
- Apache Spark applications ☆70 Updated 8 years ago
- Drizzle integration with Apache Spark ☆120 Updated 7 years ago
- Chalk is a natural language processing library. ☆260 Updated 9 years ago
- Interactive Audience Analytics with Spark and HyperLogLog ☆55 Updated 10 years ago
- Online Java to Scala converter ☆217 Updated 7 years ago
- Scalable query engine for web scraping/data mashup/acceptance QA, powered by Apache Spark ☆142 Updated last month
- A collection of Scala artifacts that make working with ZooKeeper enjoyable ☆79 Updated last year
- Support Highcharts in Apache Zeppelin ☆81 Updated 8 years ago