saran9991 / llm-data-annotation
Use Large Language Models such as OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with LLMs, employs Iterative Active Learning for continuous improvement, and integrates CleanLab (Confident Learning) to ensure high-quality datasets and better model performance.
☆37 Updated last year
Alternatives and similar repositories for llm-data-annotation
Users who are interested in llm-data-annotation are comparing it to the libraries listed below.
- ☆366 Updated last year
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ☆155 Updated last year
- A simple recipe for training and inferencing Transformer architecture for Multi-Task Learning on custom datasets. You can find two approa… ☆97 Updated 3 years ago
- Benchmarking Large Language Models ☆98 Updated last month
- Retrieval-Augmented Generation-based Relation Extraction ☆45 Updated 3 weeks ago
- Calculate perplexity on a text with pre-trained language models. Support MLM (e.g. DeBERTa), recurrent LM (e.g. GPT3), and encoder-decoder … ☆162 Updated last month
- Text classification with Foundation Language Model LLaMA ☆114 Updated 2 years ago
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆84 Updated last year
- ☆78 Updated 10 months ago
- [ACL 2022] LinkBERT: A Knowledgeable Language Model 😎 Pretrained with Document Links ☆443 Updated 3 years ago
- ☆91 Updated 6 months ago
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆136 Updated last year
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆55 Updated last year
- A Python Natural Language Processing Toolkit for Medical Text Generation ☆81 Updated 2 months ago
- Efficient Attention for Long Sequence Processing ☆97 Updated last year
- Collection of NLP model explanations and accompanying analysis tools ☆144 Updated 2 years ago
- ☆44 Updated 2 years ago
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ☆214 Updated 2 weeks ago
- A Framework for Textual Entailment based Zero Shot text classification ☆152 Updated last year
- Model zoo for topic models, neural topic models, contextual embeddings for topic models ... ☆45 Updated 2 years ago
- BERT classification model for processing texts longer than 512 tokens. Text is first divided into smaller chunks and after feeding them t… ☆143 Updated last year
- This is the code for our KILT leaderboard submissions (KGI + Re2G models). ☆156 Updated 3 months ago
- ☆45 Updated 3 years ago
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆43 Updated last year
- PyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with PyTorch … ☆82 Updated 2 years ago
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆265 Updated 9 months ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆44 Updated last year
- Auto ICD coding with prompt ☆49 Updated last year
- SciFive: a text-text transformer model for biomedical literature ☆96 Updated last year
- Code for the paper "Zero-Shot Text Classification with Self-Training" for EMNLP 2022 ☆50 Updated 3 months ago