Use large language models (LLMs) such as OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with LLMs, employs iterative active learning for continuous improvement, and integrates CleanLab (Confident Learning) to ensure high-quality datasets and better model performance.
☆40 · Sep 11, 2023 · Updated 2 years ago
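To make the description concrete, here is a minimal sketch of the two ideas it names: flagging likely label errors in the spirit of confident learning, and picking uncertain examples for human review in an active-learning round. It is not the repository's code and does not use the CleanLab API. The function names (`class_thresholds`, `flag_label_issues`, `most_uncertain`) and the toy data are hypothetical, and the flagging rule is a simplified version of confident learning that assumes you already have predicted class probabilities for each LLM-annotated example.

```python
from collections import defaultdict


def class_thresholds(labels, pred_probs):
    """Per-class threshold: mean predicted probability of class k
    over the examples whose given label is k."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for y, p in zip(labels, pred_probs):
        sums[y] += p[y]
        counts[y] += 1
    return {k: sums[k] / counts[k] for k in counts}


def flag_label_issues(labels, pred_probs):
    """Simplified confident-learning rule (hypothetical helper):
    flag an example when the model is confidently predicting a class
    (probability at or above that class's threshold) that differs
    from the given label."""
    thresholds = class_thresholds(labels, pred_probs)
    issues = []
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        confident = [k for k, t in thresholds.items() if p[k] >= t]
        if confident and max(confident, key=lambda k: p[k]) != y:
            issues.append(i)
    return issues


def most_uncertain(pred_probs, k):
    """Active-learning step: indices of the k examples with the lowest
    top-class probability, i.e. the ones to send to a human annotator."""
    ranked = sorted(range(len(pred_probs)), key=lambda i: max(pred_probs[i]))
    return ranked[:k]


if __name__ == "__main__":
    # Toy data: labels as they might come from an LLM annotator,
    # probabilities as they might come from a downstream classifier.
    labels = [0, 0, 0, 1, 1, 1]
    pred_probs = [
        [0.9, 0.1], [0.8, 0.2], [0.1, 0.9],
        [0.2, 0.8], [0.1, 0.9], [0.55, 0.45],
    ]
    print("likely label errors:", flag_label_issues(labels, pred_probs))
    print("send to human review:", most_uncertain(pred_probs, 2))
```

In an iterative setup, the flagged and uncertain examples would be relabeled, the classifier retrained, and the loop repeated until few new issues appear.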
Alternatives and similar repositories for llm-data-annotation
Users who are interested in llm-data-annotation are comparing it to the libraries listed below.
- Introducing gpt_annotate: an easy-to-use python package designed to streamline automated text annotation using LLMs for different tasks a… ☆31 · Sep 7, 2024 · Updated last year
- [EMNLP 2022] Code for our paper “ZeroGen: Efficient Zero-shot Learning via Dataset Generation”. ☆16 · Feb 18, 2022 · Updated 4 years ago
- Image clustering ☆13 · Jan 22, 2022 · Updated 4 years ago
- 2019 FlyAI Chinese Weibo stance detection ☆10 · Feb 19, 2020 · Updated 6 years ago
- The code for paper "ProQA: Structural Prompt-based Pre-training for Unified Question Answering" ☆11 · Feb 7, 2023 · Updated 3 years ago
- Resources for: Cross-Lingual Disaster-related Multi-label Tweet Classification with Manifold Mixup (ACL SRW 2020) ☆11 · Sep 9, 2021 · Updated 4 years ago
- ☆13 · Nov 7, 2023 · Updated 2 years ago
- Ollama Modelfiles - Discover more at OllamaHub ☆21 · Dec 2, 2023 · Updated 2 years ago
- The official repository for paper "LLMaAA: Making Large Language Models as Active Annotators" ☆44 · Apr 14, 2024 · Updated last year
- ☆12 · Dec 26, 2023 · Updated 2 years ago
- Powerful document clustering models are essential as they can efficiently process large sets of documents. These models can be helpful in… ☆17 · Oct 30, 2022 · Updated 3 years ago
- Calibre Plugin to download metadata and covers from DNB (Deutsche Nationalbibliothek) ☆12 · Feb 9, 2026 · Updated last month
- Official implementation for "MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models" ☆18 · Oct 26, 2024 · Updated last year
- Podcast index database quality dashboard ☆15 · Mar 15, 2026 · Updated 2 weeks ago
- Early Detection of Fake News with Multi-source Weak Social Supervision ☆24 · Jun 12, 2023 · Updated 2 years ago
- Pytorch implementation of "Enhancing Chinese Pre-trained Language Model via Heterogeneous Linguistics Graph", ACL 2022 ☆15 · Feb 28, 2022 · Updated 4 years ago
- ☆12 · May 10, 2024 · Updated last year
- ☆648 · Jul 29, 2025 · Updated 8 months ago
- Contains dataset and source code for the thesis "Generative AI for Business Process Management - Suitability of Modalities". Aims to eval… ☆18 · Mar 7, 2025 · Updated last year
- Repo for paper: Controllable Text Generation with Language Constraints ☆20 · Jun 20, 2023 · Updated 2 years ago
- ☆18 · Apr 2, 2021 · Updated 4 years ago
- FrugalScore is an approach to learn a fixed, low cost version of any expensive NLG metric, while retaining most of its original performan… ☆16 · Sep 21, 2022 · Updated 3 years ago
- Implementation of Poincare Embedding in PyTorch ☆13 · Jul 27, 2017 · Updated 8 years ago
- Code for the paper: ConDA: Contrastive Domain Adaptation for AI-generated Text Detection ☆41 · Dec 21, 2023 · Updated 2 years ago
- A client for distributed financial news webscraping. ☆14 · Mar 1, 2021 · Updated 5 years ago
- Framework for controlling demographic biases in NLG (using adversarial prompts) ☆21 · Jun 12, 2023 · Updated 2 years ago
- Code and data for "Detecting Stance in Media on Global Warming". ☆15 · Dec 8, 2022 · Updated 3 years ago
- Official codes for paper "TimeDP: Learning to Generate Multi-Domain Time Series with Domain Prompts" (AAAI-25) ☆16 · May 23, 2025 · Updated 10 months ago
- Open rotating mechanical fault datasets (a compilation of open-source rotating machinery fault datasets) ☆22 · Aug 10, 2020 · Updated 5 years ago
- ☆13 · Sep 13, 2015 · Updated 10 years ago
- Using Llama2 with Haystack, the NLP/LLM framework. ☆16 · Jul 21, 2023 · Updated 2 years ago
- YOLOv5 Object Detection on Traffic Signs Dataset with Custom Training ☆12 · Jan 19, 2022 · Updated 4 years ago
- ☆22 · Dec 12, 2024 · Updated last year
- A supplementary material to "The Evolution of Work in the United States" ☆12 · Jun 23, 2021 · Updated 4 years ago
- start exploring. ☆16 · Apr 6, 2024 · Updated last year
- This repo is for the Mis2-KDD 2021 under review paper: Dataset of Propaganda Techniques of the State-Sponsored Information Operation of t… ☆19 · Feb 5, 2022 · Updated 4 years ago
- ☆19 · Apr 28, 2021 · Updated 4 years ago
- Code for the AAAI 2023 Paper "Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Gene… ☆17 · Oct 29, 2024 · Updated last year
- Data and codes for BioBERT-MRC ☆11 · Oct 5, 2021 · Updated 4 years ago