Vectorizers for a range of different data types
☆103 · Oct 9, 2025 · Updated 5 months ago
Alternatives and similar repositories for vectorizers
Users who are interested in vectorizers are comparing it to the libraries listed below.
- A visual labeling system implemented in Jupyter widgets. ☆155 · Nov 13, 2024 · Updated last year
- ☆24 · Dec 11, 2021 · Updated 4 years ago
- Ensemble topic modelling with pLSA ☆114 · Sep 30, 2021 · Updated 4 years ago
- Clustering for mixed-type data ☆101 · Jul 29, 2024 · Updated last year
- ☆86 · Mar 10, 2026 · Updated last week
- Matrix tools for building and inspecting latent spaces ☆27 · Aug 19, 2018 · Updated 7 years ago
- Explaining dimensionality results using SHAP values ☆55 · Jan 5, 2026 · Updated 2 months ago
- Transform a corpus of text documents (any kind) into a map with different zoom levels and topics names to summarise sub corpus of similar… ☆29 · Jan 1, 2024 · Updated 2 years ago
- Notebooks configured to be run with Binder, usually found on my blog. ☆42 · Mar 25, 2023 · Updated 2 years ago
- A Python nearest neighbor descent for approximate nearest neighbors ☆961 · Jan 8, 2026 · Updated 2 months ago
- density-based clustering for exploratory data analysis based on multi-parameter persistence ☆42 · Jul 20, 2025 · Updated 8 months ago
- Semi-Supervised t-SNE using a Bayesian prior based on partial labelling ☆42 · Aug 15, 2016 · Updated 9 years ago
- Hosting examples of interactive datamapplot output ☆30 · Feb 13, 2026 · Updated last month
- Unsupervised Anomaly Detection via Deep Metric Learning with End-to-End Optimization ☆12 · Mar 23, 2023 · Updated 2 years ago
- Gather module dependencies of source code ☆13 · Jul 21, 2023 · Updated 2 years ago
- RAGSkeleton: A foundational, modular framework for building customizable Retrieval-Augmented Generation (RAG) systems across any domain. ☆14 · Jun 24, 2025 · Updated 8 months ago
- Benchmark of common hash functions ☆10 · Sep 15, 2019 · Updated 6 years ago
- apricot implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models qui… ☆528 · Nov 17, 2025 · Updated 4 months ago
- A library for hypergraphs and hypergraph algorithms ☆28 · Oct 30, 2015 · Updated 10 years ago
- Python wrapper for the DPMMSubClusterStreaming.jl Julia package. ☆14 · Sep 9, 2022 · Updated 3 years ago
- A high performance implementation of HDBSCAN clustering. ☆3,086 · Updated this week
- Creating beautiful plots of data maps ☆985 · Mar 11, 2026 · Updated last week
- Fast Numba-enabled CPU and GPU computations of Earth Mover's (scipy.stats.wasserstein_distance) and Euclidean distances. ☆18 · Jan 3, 2025 · Updated last year
- ☆15 · Dec 18, 2023 · Updated 2 years ago
- The GB99dms implicit solvent force field for proteins, plus scripts and data ☆27 · Sep 19, 2025 · Updated 6 months ago
- The ntentional blog - a machine learning journey ☆23 · Oct 20, 2022 · Updated 3 years ago
- A python documentation linter which checks that the docstring description matches the definition. Based on darglint by @terrencepreilly. ☆24 · Apr 1, 2024 · Updated last year
- Utilities for working with videos ☆13 · Jul 5, 2025 · Updated 8 months ago
- This tutorial accompanies the NSF-CBMS Conference and Software Day on Topological Methods in Machine Learning and Artificial Intelligence… ☆21 · May 18, 2019 · Updated 6 years ago
- Uniform Manifold Approximation and Projection ☆8,119 · Mar 10, 2026 · Updated last week
- ☆27 · Sep 10, 2025 · Updated 6 months ago
- Bag of, not words, but tricks! ☆68 · Oct 31, 2023 · Updated 2 years ago
- Robust piecewise regression ☆14 · Jan 23, 2026 · Updated 2 months ago
- Utility functions that I reuse across different projects ☆14 · Jun 4, 2021 · Updated 4 years ago
- A clean and easy interface for performing nearest-neighbor lookups ☆50 · Jan 13, 2020 · Updated 6 years ago
- ☆95 · Jul 4, 2025 · Updated 8 months ago
- Helpers for working with pymatgen structure graphs. ☆12 · Feb 4, 2025 · Updated last year
- Minimum-distortion embedding with PyTorch ☆581 · Feb 26, 2026 · Updated 3 weeks ago
- just a bunch of useful embeddings for scikit-learn pipelines ☆523 · Feb 12, 2026 · Updated last month