[ACL2025 Findings] Official code for MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space
☆28Aug 30, 2025Updated 7 months ago
Alternatives and similar repositories for MIG
Users that are interested in MIG are comparing it to the libraries listed below.
- ☆28Jul 11, 2024Updated last year
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆285Aug 20, 2023Updated 2 years ago
- Danmuku dataset☆12Jul 7, 2023Updated 2 years ago
- [ACL 2023] Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation.☆10Dec 19, 2024Updated last year
- CVPR2022:Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency☆18Aug 10, 2022Updated 3 years ago
- ☆17Oct 2, 2024Updated last year
- Official repository for paper MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning(https://arxiv.org/abs/2406.17770).☆159Sep 27, 2024Updated last year
- ☆13Apr 18, 2024Updated last year
- Variance Covariance Regularization☆14Jun 22, 2023Updated 2 years ago
- ☆18May 7, 2025Updated 11 months ago
- Toolkit for Universal Retrieval, such as text retrieval, item recommendation, image retrieval, etc.☆17Sep 15, 2025Updated 7 months ago
- ELECTRA MODEL NLP☆13Apr 8, 2020Updated 6 years ago
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…☆412Jun 25, 2025Updated 9 months ago
- A research project exploring fine-tuning BERT-style models for text generation☆40Nov 30, 2025Updated 4 months ago
- ☆21Jul 25, 2025Updated 8 months ago
- How to really install tensorflow-gpu from source on a clean instance of Ubuntu☆11Sep 29, 2023Updated 2 years ago
- ☆26Aug 24, 2022Updated 3 years ago
- ☆29Nov 27, 2021Updated 4 years ago
- ☆13Dec 25, 2018Updated 7 years ago
- Codes of BaiLian (POJ), Luogu, LeetCode & Course OJ☆16Dec 21, 2019Updated 6 years ago
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight)☆21Aug 1, 2025Updated 8 months ago
- Official PyTorch implementation of the paper "Equivariant Image Modeling"(https://arxiv.org/abs/2503.18948)☆36Aug 1, 2025Updated 8 months ago
- Transformer, Evolved Transformer Model☆10Jul 6, 2019Updated 6 years ago
- Master the techniques of function-calling and structured data extraction with LLMs. Learn to enhance LLM capabilities, integrate web serv…☆12Jun 29, 2024Updated last year
- repo for paper https://arxiv.org/abs/2504.13837☆336Dec 17, 2025Updated 4 months ago
- named entity recognition combined with rule from entity dict☆13Aug 25, 2020Updated 5 years ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024]☆79Nov 14, 2024Updated last year
- [COLM 2025] EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees☆31Jul 11, 2025Updated 9 months ago
- [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models☆75Nov 23, 2024Updated last year
- ☆14Oct 30, 2023Updated 2 years ago
- A retrieve and edit approach to generate sarcasm by reversing valence and adding incongruent common sense context☆32Mar 27, 2021Updated 5 years ago
- ☆12Dec 13, 2023Updated 2 years ago
- 2020 Language and Intelligence Technology Competition: relation extraction task☆10Mar 19, 2020Updated 6 years ago
- ConvGQR: Generative Query Reformulation for Conversational Search. A codebase for ACL 2023 accepted paper.☆34Mar 5, 2024Updated 2 years ago
- 2019 Sohu 3rd Content Recognition Challenge, rank 10☆11Oct 17, 2019Updated 6 years ago
- Informative Conversational Query Rewriting☆39Jan 29, 2024Updated 2 years ago
- 2019 Sohu Campus Algorithm Competition: finals solution slides (PPT) and single-model LightGBM code for entities☆71Jun 26, 2019Updated 6 years ago
- ☆12Dec 29, 2016Updated 9 years ago
- CTE: Contextualized Table Extraction Dataset☆17Feb 23, 2023Updated 3 years ago