saran9991 / llm-data-annotation
Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with LLMs, employs Iterative Active Learning for continuous improvement, and integrates CleanLab (Confident Learning) to ensure high-quality datasets and better model performance
☆36 Updated last year
Alternatives and similar repositories for llm-data-annotation
Users that are interested in llm-data-annotation are comparing it to the libraries listed below
- ☆90 Updated 3 months ago
- Text classification with Foundation Language Model LLaMA ☆115 Updated 2 years ago
- Fine-tuning of Flan-T5 LLM for text classification 🤖 focuses on adapting a state-of-the-art language model to enhance its ability to cla… ☆39 Updated 6 months ago
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆43 Updated last year
- ☆44 Updated 2 years ago
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆55 Updated last year
- 🔍 A statutory article retrieval dataset in French. (ACL 2022) ☆39 Updated last year
- A curated list of research papers and resources on Cultural LLM. ☆43 Updated 7 months ago
- Collection of NLP model explanations and accompanying analysis tools ☆145 Updated last year
- Benchmarking Large Language Models ☆96 Updated last month
- KeyPhraseTransformer lets you quickly extract key phrases, topics, themes from your text data with T5 transformer | Keyphrase extraction… ☆104 Updated 10 months ago
- Efficient Attention for Long Sequence Processing ☆94 Updated last year
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆128 Updated last year
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ☆155 Updated last year
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆84 Updated last year
- ☆71 Updated 7 months ago
- A simple recipe for training and inferencing Transformer architecture for Multi-Task Learning on custom datasets. You can find two approa… ☆96 Updated 2 years ago
- Token-level Reference-free Hallucination Detection ☆94 Updated last year
- A Python Natural Language Processing Toolkit for Medical Text Generation ☆79 Updated last week
- ☆51 Updated 4 years ago
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ☆113 Updated 2 years ago
- Dataset used to evaluate Skill Extraction systems based on the ESCO skills taxonomy. ☆13 Updated 10 months ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio… ☆44 Updated last year
- ☆42 Updated 2 years ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆67 Updated 2 years ago
- Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a docum… ☆262 Updated 6 months ago
- Finetune mistral-7b-instruct for sentence embeddings ☆80 Updated last year
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆127 Updated last year
- ☆161 Updated 10 months ago
- Generalist and Lightweight Model for Text Classification ☆128 Updated 2 weeks ago