mvogiatzis / freq-count
Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams.
☆63 · May 21, 2016 · Updated 9 years ago
Alternatives and similar repositories for freq-count
Users who are interested in freq-count are comparing it to the libraries listed below.
- Hot news detection based on wikipedia data! ☆28 · Jul 20, 2015 · Updated 10 years ago
- A modern networking framework based on ucx for Java 19+ ☆27 · Nov 23, 2023 · Updated 2 years ago
- The code for the in memory data pipeline that was presented at Berlin Buzzwords 2015. ☆10 · Jun 1, 2015 · Updated 10 years ago
- Data science repo to help others ☆12 · Feb 10, 2016 · Updated 10 years ago
- Discriminant Projection Forest results, datasets, etc. ☆44 · Nov 30, 2019 · Updated 6 years ago
- Automatically exported from code.google.com/p/jbirch ☆12 · Sep 6, 2022 · Updated 3 years ago
- Active learning of GP hyperparameters following Garnett, et al., "Active Learning of Linear Embeddings for Gaussian Processes," (UAI 2014… ☆16 · Aug 4, 2017 · Updated 8 years ago
- approximate streaming quantiles ☆31 · Jun 15, 2014 · Updated 11 years ago
- ☆10 · Jul 17, 2015 · Updated 10 years ago
- ☆10 · Jun 26, 2023 · Updated 2 years ago
- FRED simulator and associated paper ☆26 · Jan 15, 2016 · Updated 10 years ago
- PredictionIO word2vec engine template (Scala-based parallelized engine) ☆12 · Apr 22, 2015 · Updated 10 years ago
- Omnivore Optimizer and Distributed CcT ☆13 · Jun 17, 2016 · Updated 9 years ago
- Dependency and data pipeline management framework for Spark and Scala ☆15 · Apr 8, 2017 · Updated 8 years ago
- Elasticsearch plugin for approximate K-nearest-neighbors on floating-point vectors. Extended version. ☆13 · Aug 10, 2019 · Updated 6 years ago
- Entry for the Third Annual GitHub Data Challenge ☆35 · Nov 24, 2014 · Updated 11 years ago
- ☆18 · Mar 14, 2017 · Updated 8 years ago
- Collects official SARS-CoV-2 infection statistics published by the city of Dresden. ☆19 · Apr 26, 2023 · Updated 2 years ago
- Stack-based language similar to Factor. ☆26 · Jun 14, 2018 · Updated 7 years ago
- Sequential model-based optimization with a `scipy.optimize` interface ☆15 · Aug 3, 2017 · Updated 8 years ago
- PyTorch library for synthesizing programs from natural language ☆18 · Jul 25, 2024 · Updated last year
- Document or binary file vectorization with Normalized Compression Distance in Python. ☆17 · Oct 14, 2015 · Updated 10 years ago
- SMASH is a hardware-software cooperative mechanism that enables highly-efficient indexing and storage of sparse matrices. The key idea of… ☆18 · May 17, 2020 · Updated 5 years ago
- SBT project showing shading a library with SBT assembly ☆15 · Oct 4, 2018 · Updated 7 years ago
- Extracts synonyms for various terms, exploiting the redirects between terms in Wikipedia ☆12 · Nov 2, 2018 · Updated 7 years ago
- spy on your random forests ☆19 · Aug 20, 2020 · Updated 5 years ago
- ☆16 · Dec 14, 2015 · Updated 10 years ago
- A library that allows serialization of SciKit-Learn estimators into PMML ☆72 · Sep 27, 2019 · Updated 6 years ago
- Extract statistics from Wikipedia Dump files. ☆26 · Aug 2, 2021 · Updated 4 years ago
- DEPRECATED! Use https://github.com/h2oai/sparkling-water repository! H2O and Spark interoperability based on Tachyon. ☆44 · Nov 25, 2014 · Updated 11 years ago
- Data Sketches for Apache Spark ☆22 · Dec 22, 2022 · Updated 3 years ago
- Will store links to known evaluation datasets alongside stats to characterize them ☆24 · Mar 9, 2016 · Updated 9 years ago
- ☆20 · Mar 3, 2016 · Updated 9 years ago
- Implementation of "A Parallel Spatial Co-location Mining Algorithm Based on MapReduce" paper ☆49 · Sep 23, 2017 · Updated 8 years ago
- Wikipedia Live Monitor ☆22 · Dec 21, 2024 · Updated last year
- A Neural network implementation with Scala ☆20 · Jul 17, 2016 · Updated 9 years ago
- A collective communication library plugined into Hadoop ☆23 · Apr 12, 2022 · Updated 3 years ago
- Simplified Interface to Complex Memory ☆28 · Aug 31, 2023 · Updated 2 years ago
- Distributed Streaming Quantiles (for PySpark) ☆38 · Jan 30, 2014 · Updated 12 years ago