saran9991 / llm-data-annotation
Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with LLMs, employs Iterative Active Learning for continuous improvement, and integrates CleanLab (Confident Learning) to ensure high-quality datasets and better model performance.
☆38Updated last year
Alternatives and similar repositories for llm-data-annotation
Users who are interested in llm-data-annotation are comparing it to the libraries listed below.
- ☆76Updated 9 months ago
- Token-level Reference-free Hallucination Detection☆94Updated last year
- ☆91Updated 4 months ago
- Benchmarking Large Language Models☆99Updated last week
- ACL2023 - AlignScore, a metric for factual consistency evaluation.☆129Updated last year
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning☆155Updated last year
- [NAACL 2022] Robust (Controlled) Table-to-Text Generation with Structure-Aware Equivariance Learning.☆57Updated last year
- ☆44Updated 2 years ago
- Code repository for BEEP (Biomedical Evidence Enhanced Predictions) clinical outcome prediction system☆26Updated last year
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER)☆84Updated last year
- We evaluate many models used for biomedical and clinical nlp tasks, and train new models that perform much better.☆160Updated 3 years ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆59Updated 11 months ago
- The official repository for Efficient Long-Text Understanding Using Short-Text Models (Ivgi et al., 2022) paper☆69Updated 2 years ago
- ☆92Updated 3 years ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers☆131Updated last year
- SciFive: a text-text transformer model for biomedical literature☆95Updated last year
- Fact-Checking the Output of Generative Large Language Models in both Annotation and Evaluation.☆100Updated last year
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings☆43Updated last year
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands.☆55Updated last year
- PyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with PyTorch …☆82Updated 2 years ago
- Efficient Attention for Long Sequence Processing☆94Updated last year
- Code and model checkpoints for the MultiVerS model for scientific claim verification.☆45Updated last year
- Bi-encoder entity linking architecture☆46Updated 9 months ago
- Fine-tuning of Flan-T5 LLM for text classification 🤖 focuses on adapting a state-of-the-art language model to enhance its ability to cla…☆39Updated 8 months ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio…☆44Updated last year
- ☆42Updated 3 years ago
- Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd…☆60Updated 6 months ago
- ☆39Updated 2 years ago
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022)☆114Updated 2 years ago
- Collection of NLP model explanations and accompanying analysis tools☆144Updated 2 years ago