Dataset collection and preprocessing framework for NLP extreme multitask learning
☆193 · Jul 9, 2025 · Updated 8 months ago
Alternatives and similar repositories for tasksource
Users who are interested in tasksource are comparing it to the libraries listed below
- Easy ModernBERT fine-tuning and multi-task learning ☆64 · Mar 13, 2026 · Updated last week
- Automated Semantic Analysis of Discourse Markers ☆11 · May 30, 2022 · Updated 3 years ago
- Discourse Based Evaluation of Language Understanding ☆21 · Jan 28, 2023 · Updated 3 years ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 · Jul 12, 2023 · Updated 2 years ago
- Resources accompanying the "Zero-Shot Recommendation as Language Modeling" paper (ECIR2022) ☆14 · May 25, 2023 · Updated 2 years ago
- Task Compass: Scaling Multi-task Pre-training with Task Prefix (EMNLP 2022: Findings) (stay tuned & more will be updated) ☆22 · Oct 17, 2022 · Updated 3 years ago
- Implementation of ModernBERT in MLX ☆20 · Jan 7, 2026 · Updated 2 months ago
- Fast whitespace correction with Transformers ☆17 · Aug 22, 2025 · Updated 7 months ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning ☆98 · Apr 26, 2023 · Updated 2 years ago
- Code accompanying the paper Pretraining Language Models with Human Preferences ☆180 · Feb 13, 2024 · Updated 2 years ago
- Utilities for loading and running text embeddings with ONNX ☆45 · Aug 16, 2025 · Updated 7 months ago
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆14 · Apr 28, 2023 · Updated 2 years ago
- ☆21 · Oct 6, 2023 · Updated 2 years ago
- Truly flash implementation of DeBERTa disentangled attention mechanism. ☆81 · Feb 10, 2026 · Updated last month
- ☆20 · Nov 23, 2022 · Updated 3 years ago
- Don't just regulate gradients like in Muon, regulate the weights too ☆31 · Jul 30, 2025 · Updated 7 months ago
- ☆15 · Oct 24, 2023 · Updated 2 years ago
- ☆11 · Nov 27, 2022 · Updated 3 years ago
- ☆14 · Apr 29, 2025 · Updated 10 months ago
- A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate. ☆38 · Mar 14, 2026 · Updated last week
- Notebooks for training universal 0-shot classifiers on many different tasks ☆140 · Dec 28, 2024 · Updated last year
- ☆19 · Sep 16, 2025 · Updated 6 months ago
- Flexible, efficient, and context-aware generation from large unstructured knowledge sources. ☆17 · May 7, 2024 · Updated last year
- Mining Discourse Markers for Unsupervised Sentence Representation Learning ☆61 · May 31, 2023 · Updated 2 years ago
- 🥤🧑🏻🚀 Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization… ☆240 · Jan 23, 2026 · Updated last month
- Multi-task modelling extensions for huggingface transformers ☆21 · Mar 3, 2023 · Updated 3 years ago
- ☆15 · Apr 26, 2025 · Updated 10 months ago
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆12 · Aug 15, 2024 · Updated last year
- Code to create bugged Python scripts for OpenAssistant Training, maintained by https://twitter.com/Cyndesama ☆24 · Jul 23, 2023 · Updated 2 years ago
- Using modal.com to process FineWeb-edu data ☆20 · Apr 5, 2025 · Updated 11 months ago
- Correction of spaces with character-based neural language models. ☆13 · Aug 23, 2022 · Updated 3 years ago
- Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings ☆15 · May 3, 2023 · Updated 2 years ago
- A tiny BERT for low-resource monolingual models ☆31 · Dec 24, 2025 · Updated 2 months ago
- ☆17 · Apr 10, 2024 · Updated last year
- We are creating a challenging new benchmark MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models. Retrieval quest… ☆31 · Jul 9, 2020 · Updated 5 years ago
- Replication Materials for "Crowd-Sourced Text Analysis" APSR (2016) 110(2): 278-295. ☆11 · Oct 28, 2017 · Updated 8 years ago
- A project designed to extract relevant metadata from databases and transform it into context for Retrieval-Augmented Generation (RAG) in … ☆14 · Aug 6, 2025 · Updated 7 months ago
- StAtutory Reasoning Assessment ☆16 · Dec 8, 2022 · Updated 3 years ago
- Here we collect trick questions and failed tasks for open source LLMs to improve them. ☆32 · Apr 20, 2023 · Updated 2 years ago