rungalileo / dataquality
Python SDK for Galileo's NLP and CV Studio.
☆17 Updated this week
Alternatives and similar repositories for dataquality:
Users who are interested in dataquality are comparing it to the libraries listed below:
- ☆22 Updated 2 years ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆65 Updated 2 years ago
- Template-based generation of DAG cards from Metaflow classes, inspired by Google cards for machine learning models. ☆30 Updated 3 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆86 Updated 2 months ago
- XAI based human-in-the-loop framework for automatic rule-learning. ☆48 Updated 8 months ago
- A diff tool for language models ☆42 Updated last year
- Just another sentiment wrapper. ☆17 Updated 3 years ago
- ☆30 Updated 3 years ago
- ☆13 Updated 2 years ago
- A Python library aimed at dissecting and augmenting NER training data. ☆58 Updated last year
- The Python library with command line tools to interact with Dynabench (https://dynabench.org/), such as uploading models. ☆55 Updated 2 years ago
- A tool for quickly adding labels to unlabeled datasets ☆20 Updated last year
- ☆43 Updated last year
- An easy-to-use Python module that helps you to extract the BERT embeddings for a large text dataset (Bengali/English) efficiently. ☆36 Updated last year
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆32 Updated 10 months ago
- Preprocessing and analysis for training SNOMED-CT concept embeddings from CORD-19 corpus ☆14 Updated last year
- Gzip and nearest neighbors for text classification ☆55 Updated last year
- A Streamlit component for annotating text by text selecting. ☆40 Updated 9 months ago
- A simple converter from SpaCy Entities (Spans) to Huggingface BILOU formatted data (tokens and ner_tags) ☆14 Updated 5 months ago
- RATransformers - Make your transformer (like BERT, RoBERTa, GPT-2 and T5) Relation Aware! ☆41 Updated 2 years ago
- Hashformers is a framework for hashtag segmentation with Transformers and Large Language Models (LLMs). ☆70 Updated 7 months ago
- ☆77 Updated 2 years ago
- Ranking of fine-tuned HF models as base models. ☆35 Updated last year
- Visualise, evaluate, and manage annotated data ☆33 Updated 2 years ago
- Data Programming by Demonstration (DPBD) for Document Classification ☆35 Updated 3 years ago
- Heterogeneous, Task- and Domain-Specific Benchmark for Unsupervised Sentence Embeddings used in the TSDAE paper: https://arxiv.org/abs/210… ☆32 Updated 3 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset. ☆93 Updated 2 years ago
- Retrieval Augmented Generation applications ☆26 Updated last year
- Streamlit demo app to demonstrate the features of transformers interpret with multiple models. ☆25 Updated 3 years ago
- One-stop shop for running and fine-tuning transformer-based language models for retrieval ☆50 Updated last week