saran9991 / llm-data-annotation
Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with LLMs, employs Iterative Active Learning for continuous improvement, and integrates CleanLab (Confident Learning) to ensure high-quality datasets and better model performance.
☆32 Updated last year
Alternatives and similar repositories for llm-data-annotation:
Users who are interested in llm-data-annotation are comparing it to the libraries listed below.
- ☆85 Updated 5 months ago
- ☆42 Updated last year
- Token-level Reference-free Hallucination Detection ☆93 Updated last year
- Efficient Attention for Long Sequence Processing ☆91 Updated last year
- Interpreting Language Models with Contrastive Explanations (EMNLP 2022 Best Paper Honorable Mention) ☆60 Updated 2 years ago
- A simple recipe for training and inferencing Transformer architecture for Multi-Task Learning on custom datasets. You can find two approa… ☆91 Updated 2 years ago
- Calculate perplexity on a text with pre-trained language models. Support MLM (e.g. DeBERTa), recurrent LM (e.g. GPT3), and encoder-decoder … ☆147 Updated 3 months ago
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆80 Updated 8 months ago
- ☆39 Updated last year
- ☆37 Updated 6 months ago
- Fact-Checking the Output of Generative Large Language Models in both Annotation and Evaluation. ☆83 Updated last year
- Detect hallucinated tokens for conditional sequence generation. ☆64 Updated 2 years ago
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆55 Updated 8 months ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆124 Updated 10 months ago
- ☆67 Updated 3 months ago
- ☆43 Updated 2 years ago
- ☆40 Updated 8 months ago
- A curated list of research papers and resources on Cultural LLM. ☆32 Updated 3 months ago
- Benchmarking Large Language Models ☆81 Updated 3 months ago
- Multilingual Large Language Models Evaluation Benchmark ☆115 Updated 4 months ago
- In this implementation, using the Flan T5 large language model, we performed the Text Classification task on the IMDB dataset and obtaine… ☆21 Updated last year
- Long Document Summarization Papers ☆140 Updated last year
- PyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with PyTorch … ☆80 Updated 2 years ago
- Train Llama 2 & 3 on the SQuAD v2 task as an example of how to specialize a generalized (foundation) model. ☆50 Updated 7 months ago
- Define Transformers, T5 model and RoBERTa encoder-decoder model for product name generation ☆48 Updated 3 years ago
- This repository contains the relevant materials for the tutorial "Legal IR and NLP: the History, Challenges, and State-of-the-Art", held … ☆40 Updated last year
- Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021 ☆34 Updated 3 years ago
- ☆153 Updated 7 months ago
- ☆23 Updated 5 months ago
- Code and resources for our EMNLP-Findings'23 paper "DelucionQA: Detecting Hallucinations in Domain-specific Question Answering" ☆1 Updated 8 months ago