bloomberg / koan
A word2vec negative sampling implementation with correct CBOW update.
★260 · Updated 3 years ago
Alternatives and similar repositories for koan:
Users that are interested in koan are comparing it to the libraries listed below
- 🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy ★299 · Updated last year
- Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings ★76 · Updated 2 years ago
- Create interactive textual heat maps for Jupyter notebooks ★196 · Updated 7 months ago
- Self-Supervision for Named Entity Disambiguation at the Tail ★213 · Updated 2 years ago
- More interactive weak supervision with FlyingSquid ★315 · Updated 4 years ago
- Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis. ★145 · Updated 3 years ago
- Misspelling Oblivious Word Embeddings ★203 · Updated 5 years ago
- SummVis is an interactive visualization tool for text summarization. ★251 · Updated 2 years ago
- NeuralQA: A Usable Library for Question Answering on Large Datasets with BERT ★231 · Updated last year
- xfspell: the Transformer Spell Checker ★188 · Updated 4 years ago
- 📰 Natural language processing (NLP) newsletter ★301 · Updated 4 years ago
- A library to synthesize text datasets using Large Language Models (LLM) ★151 · Updated 2 years ago
- Deep learning with text doesn't have to be scary. ★276 · Updated 2 years ago
- Robust and Fast tokenizations alignment library for Rust and Python https://tamuhey.github.io/tokenizations/ ★189 · Updated last year
- Labelling platform for text using weak supervision. ★260 · Updated 2 years ago
- Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime. ★126 · Updated 4 years ago
- SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c… ★362 · Updated 2 years ago
- Question-answers, collected from Google ★125 · Updated 3 years ago
- Tool for interactive embeddings visualization ★302 · Updated 4 months ago
- Camphr - NLP library for creating pipeline components ★341 · Updated 2 years ago
- SpikeX - SpaCy Pipes for Knowledge Extraction ★397 · Updated 3 years ago
- spaCy + UDPipe ★161 · Updated 2 years ago
- ★489 · Updated 11 months ago
- Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/ ★242 · Updated 11 months ago
- Live Python Notebooks with any Editor ★277 · Updated 2 years ago
- Super lightweight function registries for your library ★176 · Updated 7 months ago
- Docs ★143 · Updated last month
- Sentence transformers models for SpaCy ★107 · Updated last year
- LASER multilingual sentence embeddings as a pip package ★224 · Updated last year
- PYthon Automated Term Extraction ★309 · Updated last year