[ACL2025 Findings] Official code for MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space
☆28Aug 30, 2025Updated 6 months ago
Alternatives and similar repositories for MIG
Users that are interested in MIG are comparing it to the libraries listed below.
- ☆27Jul 11, 2024Updated last year
- PyTorch code for Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles (DANCE)☆23Nov 29, 2022Updated 3 years ago
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆90Nov 13, 2024Updated last year
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆285Aug 20, 2023Updated 2 years ago
- Danmuku dataset☆11Jul 7, 2023Updated 2 years ago
- [ACL 2023] Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation.☆10Dec 19, 2024Updated last year
- Application and blog explaining my interpretations of In-run Data Shapley☆30Jan 30, 2025Updated last year
- ☆17Oct 2, 2024Updated last year
- Variance Covariance Regularization☆14Jun 22, 2023Updated 2 years ago
- Code for "In-Context Former: Lightning-fast Compressing Context for Large Language Model" (Findings of EMNLP 2024)☆21Nov 21, 2024Updated last year
- Toolkit for Universal Retrieval, such as text retrieval, item recommendation, image retrieval, etc.☆17Sep 15, 2025Updated 6 months ago
- It is the implementation of paper "Multi-Modal Sarcasm Detection in Twitter with Hierarchical Fusion Model"☆18Feb 19, 2021Updated 5 years ago
- ELECTRA MODEL NLP☆13Apr 8, 2020Updated 5 years ago
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…☆416Jun 25, 2025Updated 9 months ago
- [CIKM'23] Time-aware Graph Structure Learning via Sequence Prediction on Temporal Graphs☆21Aug 9, 2023Updated 2 years ago
- ☆21Jul 25, 2025Updated 8 months ago
- How to really install tensorflow-gpu from source on a clean instance of Ubuntu☆11Sep 29, 2023Updated 2 years ago
- ☆26Aug 24, 2022Updated 3 years ago
- ☆13Dec 25, 2018Updated 7 years ago
- Codes of BaiLian (POJ), Luogu, LeetCode & Course OJ☆16Dec 21, 2019Updated 6 years ago
- Implementation of SNAIL(A Simple Neural Attentive Meta-Learner) with Gluon☆12Feb 22, 2019Updated 7 years ago
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight)☆21Aug 1, 2025Updated 7 months ago
- Official PyTorch implementation of the paper "Equivariant Image Modeling"(https://arxiv.org/abs/2503.18948)☆36Aug 1, 2025Updated 7 months ago
- repo for paper https://arxiv.org/abs/2504.13837☆333Dec 17, 2025Updated 3 months ago
- Training and Inference Notebooks for the RedPajama (OpenLlama) models☆19May 18, 2023Updated 2 years ago
- The official GitHub page for the survey paper "A Survey on LLM Symbolic Reasoning". And this paper is under review.☆26Updated this week
- [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models☆74Nov 23, 2024Updated last year
- [COLM 2025] EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees☆31Jul 11, 2025Updated 8 months ago
- ☆14Oct 30, 2023Updated 2 years ago
- A retrieve and edit approach to generate sarcasm by reversing valence and adding incongruent common sense context☆32Mar 27, 2021Updated 5 years ago
- [ACL2023] Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference☆24Dec 25, 2023Updated 2 years ago
- ☆12Dec 13, 2023Updated 2 years ago
- ☆11Nov 27, 2018Updated 7 years ago
- 2020 Language and Intelligence Technology Competition: relation extraction task☆10Mar 19, 2020Updated 6 years ago
- CVPR2023☆18Mar 18, 2023Updated 3 years ago
- ConvGQR: Generative Query Reformulation for Conversational Search. A codebase for ACL 2023 accepted paper.☆34Mar 5, 2024Updated 2 years ago
- Informative Conversational Query Rewriting☆38Jan 29, 2024Updated 2 years ago
- 2019 Sohu 3rd Content Recognition Challenge, rank 10☆11Oct 17, 2019Updated 6 years ago
- 2019 Sohu Campus Algorithm Competition. Finals solution slides (PPT) and single-model LightGBM (lgb) code for entities☆71Jun 26, 2019Updated 6 years ago