saran9991 / llm-data-annotation
Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with LLMs, employs Iterative Active Learning for continuous improvement, and integrates CleanLab (Confident Learning) to ensure high-quality datasets and better model performance.
☆29 Updated last year
Related projects
Alternatives and complementary repositories for llm-data-annotation
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ☆141 Updated 7 months ago
- ☆82 Updated 3 months ago
- ☆333 Updated 11 months ago
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆78 Updated 6 months ago
- Long Document Summarization Papers ☆136 Updated last year
- Multilingual Large Language Models Evaluation Benchmark ☆105 Updated 2 months ago
- ☆42 Updated 2 years ago
- Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation ☆193 Updated 9 months ago
- SemEval2024-task8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection ☆70 Updated 6 months ago
- ☆166 Updated last year
- Token-level Reference-free Hallucination Detection ☆92 Updated last year
- ☆38 Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆122 Updated 7 months ago
- Source Code of Paper "GPTScore: Evaluate as You Desire" ☆230 Updated last year
- Text classification with Foundation Language Model LLaMA ☆110 Updated last year
- SemEval2024-task 11: Bridging the Gap in Text-Based Emotion Detection ☆24 Updated this week
- Source code for paper "Learning from Noisy Labels for Entity-Centric Information Extraction", EMNLP 2021 ☆55 Updated 2 years ago
- A Large-Scale Dataset for Empathetic Response Generation ☆39 Updated 6 months ago
- A Framework for Textual Entailment based Zero Shot text classification ☆154 Updated 7 months ago
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆65 Updated 8 months ago
- ☆208 Updated 8 months ago
- Transformer-based model for learning authorship representations. ☆26 Updated 3 months ago
- Codebase, data and models for the SummaC paper in TACL ☆85 Updated 10 months ago
- ☆93 Updated 2 years ago
- Code and resources for our EMNLP-Findings'23 paper "DelucionQA: Detecting Hallucinations in Domain-specific Question Answering" ☆2 Updated 6 months ago
- A new collection of 1.7k doctor-patient conversations and corresponding clinical notes/summaries. ☆54 Updated last year
- ☆38 Updated last year
- BERT classification model for processing texts longer than 512 tokens. Text is first divided into smaller chunks and after feeding them t… ☆124 Updated 4 months ago
- A Survey of Attributions for Large Language Models ☆166 Updated 2 months ago
- Unofficial implementation of paper "InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER" (https://arxiv.… ☆37 Updated 8 months ago