saran9991 / llm-data-annotation
Use Large Language Models like OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with LLMs, employs Iterative Active Learning for continuous improvement, and integrates CleanLab (Confident Learning) to ensure high-quality datasets and better model performance.
☆35 Updated last year
Alternatives and similar repositories for llm-data-annotation:
Users who are interested in llm-data-annotation are comparing it to the libraries listed below.
- ☆44 Updated 2 years ago
- Benchmarking Large Language Models ☆96 Updated last month
- ☆39 Updated last year
- ☆90 Updated 2 months ago
- Multilingual Large Language Models Evaluation Benchmark ☆123 Updated 8 months ago
- Token-level Reference-free Hallucination Detection ☆94 Updated last year
- BERT classification model for processing texts longer than 512 tokens. Text is first divided into smaller chunks and after feeding them t… ☆141 Updated 10 months ago
- M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection ☆24 Updated last year
- A simple recipe for training and inferencing Transformer architecture for Multi-Task Learning on custom datasets. You can find two approa… ☆95 Updated 2 years ago
- Text classification with Foundation Language Model LLaMA ☆115 Updated 2 years ago
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆68 Updated last year
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆55 Updated 11 months ago
- Biomedical Data-to-Text Generation via Fine-Tuning Transformers ☆28 Updated 3 years ago
- SemEval2024-task 11: Bridging the Gap in Text-Based Emotion Detection ☆41 Updated last week
- Codebase, data and models for the SummaC paper in TACL ☆91 Updated 2 months ago
- Code repository for BEEP (Biomedical Evidence Enhanced Predictions) clinical outcome prediction system ☆26 Updated last year
- ☆34 Updated 6 months ago
- Interpreting Language Models with Contrastive Explanations (EMNLP 2022 Best Paper Honorable Mention) ☆62 Updated 2 years ago
- Calculate perplexity on a text with pre-trained language models. Support MLM (eg. DeBERTa), recurrent LM (eg. GPT3), and encoder-decoder … ☆155 Updated 6 months ago
- ☆71 Updated 7 months ago
- ☆41 Updated 11 months ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆67 Updated 2 years ago
- ☆29 Updated 5 months ago
- Code associated with the paper "Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists" ☆48 Updated 2 years ago
- Collection of NLP model explanations and accompanying analysis tools ☆145 Updated last year
- Data set for LREC 2020 paper "I Feel Offended, Don't Be Abusive!" ☆18 Updated last year
- Official repo for SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency ☆35 Updated 3 months ago
- auto icd coding with prompt ☆49 Updated 11 months ago
- DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text ☆29 Updated last year
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, … ☆82 Updated last year