saran9991 / llm-data-annotation
Use Large Language Models such as OpenAI's GPT-3.5 for data annotation and model enhancement. This framework combines human expertise with LLMs, employs Iterative Active Learning for continuous improvement, and integrates CleanLab (Confident Learning) to ensure high-quality datasets and better model performance.
☆30 Updated last year
Related projects
Alternatives and complementary repositories for llm-data-annotation
- Multilingual Large Language Models Evaluation Benchmark ☆107 Updated 3 months ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆122 Updated 8 months ago
- Long Document Summarization Papers ☆137 Updated last year
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ☆142 Updated 8 months ago
- Codebase, data and models for the SummaC paper in TACL ☆85 Updated 11 months ago
- ☆147 Updated 5 months ago
- Source Code of Paper "GPTScore: Evaluate as You Desire" ☆231 Updated last year
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆37 Updated 8 months ago
- Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆115 Updated last month
- ☆211 Updated 8 months ago
- Efficient Attention for Long Sequence Processing ☆89 Updated 11 months ago
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆54 Updated 6 months ago
- ☆48 Updated 7 months ago
- Are foundation LMs multilingual knowledge bases? (EMNLP 2023) ☆18 Updated 11 months ago
- A Multilingual Replicable Instruction-Following Model ☆94 Updated last year
- ☆38 Updated last year
- Official repo for SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency ☆33 Updated 5 months ago
- RARR: Researching and Revising What Language Models Say, Using Language Models ☆43 Updated last year
- A Large-Scale Dataset for Empathetic Response Generation ☆40 Updated 7 months ago
- Train Llama 2 & 3 on the SQuAD v2 task as an example of how to specialize a generalized (foundation) model. ☆47 Updated 5 months ago
- ☆28 Updated 2 years ago
- ☆333 Updated 11 months ago
- ☆26 Updated last month
- Text generation using language models with multiple exit heads ☆15 Updated last year
- ☆36 Updated 6 months ago
- Token-level Reference-free Hallucination Detection ☆93 Updated last year
- LexGLUE: A Benchmark Dataset for Legal Language Understanding in English ☆187 Updated last year
- DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text ☆25 Updated last year
- This repository contains the relevant materials for the tutorial "Legal IR and NLP: the History, Challenges, and State-of-the-Art", held … ☆37 Updated last year
- Instruct LLMs for flat and nested NER. Fine-tuning Llama and Mistral models for instruction named entity recognition. (Instruction NER) ☆78 Updated 6 months ago