[TACL, EMNLP 2025 Oral] Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation"
☆35 Dec 5, 2025 Updated 6 months ago
Alternatives and similar repositories for CRAFT
Users that are interested in CRAFT are comparing it to the libraries listed below.
- ☆13 Oct 31, 2025 Updated 7 months ago
- mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models ☆11 Jan 19, 2024 Updated 2 years ago
- ACE (Adaptive Code Evolution) is an AI-powered system for code analysis and optimization. ☆12 Mar 25, 2026 Updated 2 months ago
- [WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages ☆17 Apr 14, 2026 Updated 2 months ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆48 Jul 25, 2023 Updated 2 years ago
- ☆13 Aug 13, 2024 Updated last year
- This GUI aims to simplify the process of converting GGUF files to llamafile format by providing an intuitive and convenient way for users… ☆14 Jan 2, 2026 Updated 5 months ago
- ParCourE - Parallel Corpus Explorer ☆12 Dec 27, 2021 Updated 4 years ago
- Python client for TeraChem Cloud ☆13 Jun 19, 2025 Updated 11 months ago
- Python re-implementation of szabo.f ☆10 Jul 30, 2015 Updated 10 years ago
- Centrally manage all prompts. ☆14 Nov 27, 2024 Updated last year
- Create Vector Store from Scratch in pure Python. ☆13 Dec 15, 2023 Updated 2 years ago
- ☆26 Feb 19, 2023 Updated 3 years ago
- ☆23 Apr 10, 2024 Updated 2 years ago
- LLM KV Cache compression - K+V dual compression, 73-99% VRAM savings, zero accuracy loss ☆57 Mar 30, 2026 Updated 2 months ago
- Research repository to the publication: Navigating the Design Space of Equivariant Diffusion-Based Generative Models for De Novo 3D Molec… ☆14 Apr 2, 2024 Updated 2 years ago
- Advanced Coding AI Assistant that uses a Gradio interface to stream coding related responses. ChatRAG supports local and API inference an… ☆26 May 6, 2025 Updated last year
- Data Science for Materials - Collection of Open Educational Resources ☆17 Jun 18, 2025 Updated 11 months ago
- ☆15 Jan 10, 2022 Updated 4 years ago
- Fine-tuning models for different NLP tasks ☆15 Jan 22, 2023 Updated 3 years ago
- ☆17 Aug 28, 2023 Updated 2 years ago
- Experimental tl;dr summaries for datasets on the Hugging Face Hub! ☆10 Apr 4, 2024 Updated 2 years ago
- Embedding data warehouse, vector store, vector similarity search engine, vector knowledge base ☆12 Apr 24, 2024 Updated 2 years ago
- ☆13 Oct 31, 2024 Updated last year
- The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification, EMNLP-Findings 2020. ☆18 Aug 27, 2021 Updated 4 years ago
- A simple implementation of ReasonGenRM. ☆19 Apr 21, 2025 Updated last year
- ☆17 Jun 18, 2016 Updated 9 years ago
- Download, parse, and filter data from Court Listener, part of the FreeLaw projects. Data-ready for The-Pile. ☆16 Jun 3, 2023 Updated 3 years ago
- ☆13 Sep 10, 2025 Updated 9 months ago
- This is a detailed code demo on how to conduct Full-Param Supervised Fine-tuning (SFT) and DPO (Direct Preference Optimization) ☆19 Jan 9, 2025 Updated last year
- Algorithms and Programming weekly lab-hour worksheets ☆19 Oct 9, 2022 Updated 3 years ago
- ☆29 May 6, 2026 Updated last month
- ☆10 Oct 22, 2024 Updated last year
- ☆14 Oct 31, 2016 Updated 9 years ago
- w3act is an annotation and curation tool for building web archive collections ☆21 Jan 30, 2024 Updated 2 years ago
- ☆14 Mar 26, 2020 Updated 6 years ago
- ALAS: Autonomous Learning Agent System ☆18 Aug 14, 2025 Updated 10 months ago
- A repository to organize materials from the AI4LAM Teach and Learning Working Group ☆14 May 5, 2023 Updated 3 years ago
- Exploration of automated dataset selection approaches at large scales. ☆55 Mar 4, 2025 Updated last year