Large-scale topic discovery with Sampled-MinHashing
☆10 · Jul 3, 2019 · Updated 6 years ago
Alternatives and similar repositories for SMH-Topic-Discovery
Users that are interested in SMH-Topic-Discovery are comparing it to the libraries listed below.
- A method to mine beyond-pairwise relationships using Min-Hashing for large-scale pattern discovery ☆28 · Oct 10, 2021 · Updated 4 years ago
- Some convenient natural language tools that build on NLTK. ☆85 · Jun 27, 2014 · Updated 11 years ago
- Protocol for finding informative protein families and then using them to score omic samples ☆18 · Jul 27, 2022 · Updated 3 years ago
- Protocol for finding informative protein families and then using them to score metagenomic sets. ☆10 · Oct 11, 2021 · Updated 4 years ago
- ☆14 · Jun 9, 2019 · Updated 6 years ago
- ☆12 · Sep 22, 2015 · Updated 10 years ago
- UBC MDS blog and academic site ☆16 · Mar 10, 2026 · Updated 2 weeks ago
- Extract statistics from Wikipedia Dump files. ☆26 · Aug 2, 2021 · Updated 4 years ago
- Exploration of the U.S. rulesets as a network ☆15 · May 20, 2022 · Updated 3 years ago
- Jupyter notebooks for the code samples of the book "Deep Learning with Python" ☆10 · Jan 18, 2018 · Updated 8 years ago
- ☆13 · Feb 20, 2020 · Updated 6 years ago
- CrossRE: A Cross-Domain Dataset for Relation Extraction (Findings of EMNLP 2022) ☆50 · Aug 20, 2024 · Updated last year
- Repository for R package MetQy (read related publication here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6247936/) ☆18 · Jan 21, 2022 · Updated 4 years ago
- A Hadoop toolkit for web-scale information retrieval research ☆85 · Dec 12, 2014 · Updated 11 years ago
- Fast and customizable tokenization ☆67 · Jul 9, 2019 · Updated 6 years ago
- Class frequency estimation software package ☆13 · Sep 1, 2019 · Updated 6 years ago
- Data manipulations in Julia ☆17 · Feb 22, 2018 · Updated 8 years ago
- Implementation of the paper "Fair Clustering Through Fairlets" by Chierichetti et al. (NIPS 2017) ☆11 · Nov 29, 2019 · Updated 6 years ago
- Python module to remove wiki markup text. ☆10 · Jan 15, 2016 · Updated 10 years ago
- Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-me… ☆12 · Dec 25, 2025 · Updated 3 months ago
- create a browser of a corpus using a topic model; original TMVE implementation (static pages) ☆46 · Jun 29, 2015 · Updated 10 years ago
- Using Siamese LSTM to classify repeated quora questions. Attempted pretrained bert embeddings, Word2Vec and training own embeddings toget… ☆10 · Aug 28, 2020 · Updated 5 years ago
- Production Ready Docker Container for TensorFlow Serving ☆17 · Sep 11, 2017 · Updated 8 years ago
- PhyloAcc a software to detect the changes of conservation of a genomic region ☆35 · Feb 4, 2026 · Updated last month
- chart for comparing the corona epidemic between different countries ☆15 · Aug 25, 2023 · Updated 2 years ago
- AST factorization: transformation AST of Kotlin source code to a vector ☆11 · Oct 17, 2019 · Updated 6 years ago
- PyTorch library for synthesizing programs from natural language ☆18 · Jul 25, 2024 · Updated last year
- JotForm API - Python Client ☆55 · Feb 23, 2024 · Updated 2 years ago
- An alternative front end for Amazon Mechanical Turk ☆12 · May 13, 2024 · Updated last year
- Framework para corpus paralelos | Framework for parallel corpora ☆20 · Mar 4, 2026 · Updated 3 weeks ago
- CoronaWhy Common Research and Data Infrastructure for COVID-19 ☆13 · Dec 2, 2020 · Updated 5 years ago
- 👩🤝🤖 A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated) ☆24 · May 2, 2023 · Updated 2 years ago
- Supporting code for Learning to Rank (LTR) presentation ☆16 · Oct 11, 2018 · Updated 7 years ago
- Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018). ☆15 · Feb 21, 2019 · Updated 7 years ago
- Drawing tree structures with SVG and JavaScript ☆34 · Aug 2, 2015 · Updated 10 years ago
- NLP2API: Query Reformulation for Code Search using Crowdsourced Knowledge and Extra-Large Data Analytics. ☆12 · Dec 31, 2020 · Updated 5 years ago
- Read Carl Zeiss image files (CZI). ☆35 · Mar 17, 2026 · Updated last week
- ☆12 · Apr 5, 2017 · Updated 8 years ago
- Implementation query expansion in semantic meta-search engine. The resulting expansion system is called Wiki-MetaSemantik. ☆11 · Feb 10, 2019 · Updated 7 years ago