Keyword extraction
☆18Dec 15, 2015Updated 10 years ago
Alternatives and similar repositories for scala-tfidf
Users who are interested in scala-tfidf are comparing it to the libraries listed below
- Automatically exported from code.google.com/p/jbirch☆12Sep 6, 2022Updated 3 years ago
- Spark Implementation of BIRCH Clustering algorithm☆13Feb 18, 2020Updated 6 years ago
- Chalk is a natural language processing library.☆260Jan 30, 2017Updated 9 years ago
- Manual for RStudio Server☆16Oct 5, 2013Updated 12 years ago
- Simple UDF to split JSON arrays into Hive arrays☆10Jun 24, 2016Updated 9 years ago
- A Java implementation of LIBFFM: A Library for Field-aware Factorization Machines☆10Jan 4, 2022Updated 4 years ago
- Code for Springer Book: High Performance Distributed Computing: Case Studies with Hadoop, Scalding and Spark☆15Oct 6, 2017Updated 8 years ago
- BerkeleyX: CS100.1x, Introduction to Big Data with Apache Spark☆11Jul 27, 2015Updated 10 years ago
- memo & blog☆17Feb 8, 2015Updated 11 years ago
- Scikit-learn quickstart tutorial for Webstep☆19May 4, 2017Updated 8 years ago
- A Spark-based LexRank extractive summarizer for text documents☆19Dec 23, 2015Updated 10 years ago
- the database manager for Apache Hive☆21Jan 5, 2018Updated 8 years ago
- Chinese Word Segmentation Based on Deep Learning and LSTM Neural Network☆21Nov 22, 2016Updated 9 years ago
- Merge Small files for Hive Table on HDFS☆15Mar 4, 2014Updated 12 years ago
- Social Media Data Mining and Analytics - HyperLogLog, BloomFilter and CountMinSketch with Scalding & Algebird☆27Oct 6, 2018Updated 7 years ago
- A modular graph-based Retrieval-Augmented Generation (RAG) system☆13Jul 27, 2024Updated last year
- Hive UDFs for the data warehouse☆20May 7, 2018Updated 7 years ago
- k-means-visualization☆25May 22, 2014Updated 11 years ago
- A CLion plugin that shows formatted macro expansion in the code documentation panel☆13Jan 22, 2020Updated 6 years ago
- Quickly index and query large numbers of text résumés using the simhash algorithm☆21Dec 16, 2015Updated 10 years ago
- DEPRECATED! Use https://github.com/h2oai/sparkling-water repository! H2O and Spark interoperability based on Tachyon.☆44Nov 25, 2014Updated 11 years ago
- word2vec patch for Mac OS X☆25Jul 9, 2014Updated 11 years ago
- This repo contains the Vagrant file and configurations for setting up a three node (client, master and data node) elasticsearch(2.2.0 vers…☆18Feb 4, 2016Updated 10 years ago
- Augustus is an open source system for building and scoring statistical models designed to work with data sets that are too large to fit i…☆43Dec 19, 2013Updated 12 years ago
- ☆22Feb 5, 2025Updated last year
- Chinese translation of Pro Django☆11Jun 11, 2017Updated 8 years ago
- emulates Maven's uniqueVersion snapshots☆24May 20, 2014Updated 11 years ago
- Use Cascading Taps and Scalding DSL with Spark☆49Dec 28, 2016Updated 9 years ago
- Java 8 Factorization Machines Library☆28Feb 17, 2017Updated 9 years ago
- HanLP Chinese Analysis Plugin for Elasticsearch http://www.elasticsearch.org☆19Aug 10, 2016Updated 9 years ago
- Assembly of fundamental statistics implemented based on Apache Spark☆31Feb 11, 2016Updated 10 years ago
- Easily use Google's reCAPTCHA within your Angular forms☆11Mar 27, 2018Updated 7 years ago
- An efficient native implementation of the HyperLogLog cardinality estimator for Ruby☆36Nov 16, 2012Updated 13 years ago
- An MCP (Model Context Protocol) server that provides automated GUI testing and control capabilities through PyAutoGUI.☆41Apr 2, 2025Updated 11 months ago
- A new language for optimization☆13May 17, 2021Updated 4 years ago
- A module for the Play Framework to build highly modular applications☆76Jan 31, 2022Updated 4 years ago
- Custom JupyterLab container for local-workstations and in-cluster Kubernetes Data Science, Machine Learning and IoT.☆12Aug 22, 2019Updated 6 years ago
- Using Istio Across Private and Public Clusters☆14Apr 20, 2019Updated 6 years ago
- A persistent history tree for undo/redo☆25Apr 11, 2021Updated 4 years ago