charlesdedampierre / BunkaTopics
🗺️ Data Cleaning and Textual Data Visualization 🗺️
★146 Updated 5 months ago
Related projects
Alternatives and complementary repositories for BunkaTopics
- A BERT-based application for reusable text classification at scale ★37 Updated last year
- A spaCy wrapper for GliNER ★91 Updated 4 months ago
- Notebooks for training universal 0-shot classifiers on many different tasks ★106 Updated 7 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ★183 Updated last month
- ★68 Updated 8 months ago
- Generalist and Lightweight Model for Relation Extraction (Extract any relationship types from text) ★128 Updated this week
- SpanMarker for Named Entity Recognition ★401 Updated 3 months ago
- Let's build better datasets, together! ★206 Updated this week
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data ★61 Updated 9 months ago
- awesome synthetic (text) datasets ★242 Updated 3 weeks ago
- 📚 Process PDFs, Word documents and more with spaCy ★75 Updated this week
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ★340 Updated last month
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ★72 Updated last year
- Tools for interactive visual exploration of semantic embeddings. ★29 Updated 2 months ago
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ★386 Updated 9 months ago
- The Fastest State-of-the-Art Static Embeddings in the World ★473 Updated this week
- Easily embed, cluster and semantically label text datasets ★463 Updated 7 months ago
- End-to-end zero-shot entity and relation extraction ★58 Updated 3 months ago
- HDBSCAN Tuning for BERTopic Models ★42 Updated last year
- ★82 Updated 6 months ago
- ★333 Updated 11 months ago
- SpaCyEx allows the creation of spaCy Matcher patterns with RegEx like syntax. ★57 Updated 6 months ago
- 💫 SpaCy wrapper for ConceptNet 💫 ★88 Updated last year
- Powerful topic model visualization in Python ★103 Updated 2 months ago
- Evaluation of language models on mono- or multilingual tasks. ★75 Updated this week
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ★103 Updated 6 months ago
- Streamlit Annotation Tools is a Streamlit component that gives you access to various annotation tools (labeling, highlighting, etc.) for … ★79 Updated 10 months ago
- Generalist and Lightweight Model for Text Classification ★51 Updated last week
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ★53 Updated 3 months ago
- Late Interaction Models Training & Retrieval ★165 Updated this week