☆261Mar 26, 2025Updated 11 months ago
Alternatives and similar repositories for Data-Provenance-Collection
Users who are interested in Data-Provenance-Collection are comparing it to the libraries listed below
- [ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization☆29Sep 12, 2024Updated last year
- ☆15Sep 15, 2023Updated 2 years ago
- [NeurIPS 2023 Main Track] This is the repository for the paper titled "Don’t Stop Pretraining? Make Prompt-based Fine-tuning Powerful Lea…☆76Feb 4, 2024Updated 2 years ago
- Interview-based evaluation of LLMs☆25Jan 8, 2025Updated last year
- Link any file anywhere on your computer!☆11May 11, 2024Updated last year
- A clone of OpenAI's Tokenizer page for HuggingFace Models☆46Nov 13, 2023Updated 2 years ago
- an implementation of Self-Extend, to expand the context window via grouped attention☆119Jan 7, 2024Updated 2 years ago
- [ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets☆218Dec 24, 2023Updated 2 years ago
- EMNLP DiscoEval paper☆43Nov 12, 2019Updated 6 years ago
- Data and tools for generating and inspecting OLMo pre-training data.☆1,411Nov 5, 2025Updated 3 months ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification☆11Aug 12, 2023Updated 2 years ago
- LLM red teaming datasets from the paper 'Student-Teacher Prompting for Red Teaming to Improve Guardrails' for the ART of Safety Workshop …☆22Oct 12, 2023Updated 2 years ago
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1…☆14Jan 19, 2024Updated 2 years ago
- A simple and efficient baseline for data attribution☆11Nov 10, 2023Updated 2 years ago
- This is the course project repository for AA002-John Doe (working name), used for group management, document archiving, etc.☆10Apr 27, 2020Updated 5 years ago
- [TACL 2024] Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis☆11Nov 14, 2024Updated last year
- ☆12Mar 25, 2024Updated last year
- benchmarks for evaluating MT models☆11Jun 26, 2024Updated last year
- Run safety benchmarks against AI models and view detailed reports showing how well they performed.☆120Feb 18, 2026Updated last week
- Scaling Data-Constrained Language Models☆342Jun 28, 2025Updated 8 months ago
- Implementation of "ACL'24: When Do LLMs Need Retrieval Augmentation? Mitigating LLMs’ Overconfidence Helps Retrieval Augmentation"☆24Jul 19, 2024Updated last year
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation☆49Dec 22, 2023Updated 2 years ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"☆48Jan 17, 2024Updated 2 years ago
- Command-line utility to return Zotero record field values given a Zotero select link, an item key, or even just a file attachment☆10Dec 23, 2023Updated 2 years ago
- ☆11Sep 19, 2025Updated 5 months ago
- Apps that run on modal.com☆13Sep 14, 2025Updated 5 months ago
- 🤖 Code for our EMNLP 2022 paper: "BotsTalk: Machine-sourced Framework for Automatic Curation of Large-scale Multi-skill Dialogue Dataset…☆16Oct 7, 2024Updated last year
- All-in-one repository for Fine-tuning & Pretraining (Large) Language Models☆15Mar 8, 2023Updated 2 years ago
- CMU Linguistic Annotation Backend☆15Sep 22, 2025Updated 5 months ago
- Open-source AI for voice control, rivaling Alexa and Siri☆13Mar 9, 2024Updated last year
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency.☆12Oct 12, 2024Updated last year
- Original PyTorch implementation for AAAI 2021 Paper "Meta-Transfer Learning for Low-Resource Abstractive Summarization."☆26Jan 11, 2023Updated 3 years ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends☆2,311Feb 20, 2026Updated last week
- [ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.☆256Oct 30, 2024Updated last year
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models☆1,660Mar 8, 2024Updated last year
- A repository to perform self-instruct with a model on HF Hub☆32Sep 29, 2023Updated 2 years ago
- [EMNLP 2023] Official repository for Dialogue Chain-of-Thought Distillation (DONUT & DOCTOR)☆11Nov 15, 2023Updated 2 years ago
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights☆19Oct 9, 2022Updated 3 years ago
- Extracting Entities with Limited Evidence☆16Dec 26, 2022Updated 3 years ago