[ACL 2025 Main] Official Repo for Paper "Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric"
☆36Feb 10, 2026Updated last month
Alternatives and similar repositories for NovelSum
Users that are interested in NovelSum are comparing it to the libraries listed below.
- A Recipe for Building LLM Reasoners to Solve Complex Instructions☆31Oct 9, 2025Updated 5 months ago
- Code for "Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding" (EMNLP 2020).☆11May 1, 2025Updated 10 months ago
- Split bib files for anthology bibliography for overleaf☆11Aug 25, 2024Updated last year
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆90Nov 13, 2024Updated last year
- Original code base for On Pretraining Data Diversity for Self-Supervised Learning☆14Dec 30, 2024Updated last year
- A better Alpaca Model Trained with Less Data (only 9k instructions of the original set)☆24Jul 26, 2024Updated last year
- Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.☆22Jul 18, 2025Updated 8 months ago
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning☆87Dec 14, 2023Updated 2 years ago
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …☆18Apr 12, 2024Updated last year
- Surgically de-slop LLMs☆14Jun 1, 2025Updated 9 months ago
- PyTorch implementation of PtrNet to solve sorting problem.☆12Dec 19, 2017Updated 8 years ago
- ☆13Jul 25, 2025Updated 8 months ago
- Exploration of automated dataset selection approaches at large scales.☆53Mar 4, 2025Updated last year
- AIME API Server - Scalable AI Model Inference API Server☆15Sep 19, 2025Updated 6 months ago
- Visual Bidirectional Kernelized Network for Visual Question Answering☆11Jul 17, 2017Updated 8 years ago
- Code for the "Long Context Needs Some R&R" paper.☆12Mar 11, 2024Updated 2 years ago
- This repository contains code for the *SEM 2023 paper “Generative Data Augmentation for Aspect Sentiment Quad Prediction”.☆11May 30, 2023Updated 2 years ago
- ☆11Mar 10, 2017Updated 9 years ago
- ☆28Mar 20, 2024Updated 2 years ago
- ☆12Jan 7, 2020Updated 6 years ago
- This is a Utrecht University dissertation template for LaTeX☆22Jul 31, 2025Updated 7 months ago
- EANN (PyTorch)☆10Mar 12, 2022Updated 4 years ago
- ☆27Jul 18, 2025Updated 8 months ago
- https://footprints.baulab.info☆18Oct 4, 2024Updated last year
- NIILC QA data☆18Nov 20, 2015Updated 10 years ago
- STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models☆42Updated this week
- CITE: A Corpus of Image-Text Discourse Relations☆13Apr 7, 2019Updated 6 years ago
- [COLING 2025] Official Repo for Paper "Beyond Boundaries: Learning Universal Entity Taxonomy across Datasets and Languages for Open Named…☆27Feb 5, 2026Updated last month
- Code and data for the NAACL 2021 paper: "XFORMAL: A Benchmark for Multilingual Formality Style Transfer"☆12Jun 7, 2021Updated 4 years ago
- Code and data for paper "Large language models can rate news outlet credibility"☆13Aug 10, 2024Updated last year
- Unsupervised diverse image generation via GANs: Partition Guided Mixture of Generative Adversarial Networks☆13Nov 3, 2021Updated 4 years ago
- Code, benchmark and environment for "OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows"☆40Nov 10, 2025Updated 4 months ago
- We enable LLM with personalization capability☆11Nov 16, 2023Updated 2 years ago
- Code for "HiChunk: Evaluating and Enhancing Retrieval-Augmented Generation with Hierarchical Chunking"☆90Nov 18, 2025Updated 4 months ago
- Download, parse, and filter data from Phil Papers. Data-ready for The-Pile.☆19Aug 28, 2023Updated 2 years ago
- ☆11Mar 13, 2023Updated 3 years ago
- PyTorch study☆14Oct 16, 2017Updated 8 years ago
- ☆10Apr 24, 2022Updated 3 years ago
- Fuzzy Aggregators and Similarity Into a Logic Language☆26Sep 12, 2024Updated last year