Dataset collection and preprocessing framework for NLP extreme multitask learning
☆195 · Jul 9, 2025 · Updated 11 months ago
Alternatives and similar repositories for tasksource
Users that are interested in tasksource are comparing it to the libraries listed below.
- Easy modernBERT fine-tuning and multi-task learning ☆65 · Mar 13, 2026 · Updated 3 months ago
- Automated Semantic Analysis of Discourse Markers ☆11 · May 30, 2022 · Updated 4 years ago
- Discourse Based Evaluation of Language Understanding ☆21 · Jan 28, 2023 · Updated 3 years ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 · Jul 12, 2023 · Updated 2 years ago
- Task Compass: Scaling Multi-task Pre-training with Task Prefix (EMNLP 2022: Findings) (stay tuned & more will be updated) ☆22 · Oct 17, 2022 · Updated 3 years ago
- Implementation of ModernBERT in MLX ☆21 · Jan 7, 2026 · Updated 5 months ago
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best … ☆10 · Nov 3, 2023 · Updated 2 years ago
- Extracts plain text, language identification and more metadata from WARC records ☆23 · Apr 16, 2026 · Updated 2 months ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning ☆99 · Apr 26, 2023 · Updated 3 years ago
- Anh - LAION's multilingual assistant datasets and models ☆28 · Apr 5, 2023 · Updated 3 years ago
- Code accompanying the paper Pretraining Language Models with Human Preferences ☆181 · Feb 13, 2024 · Updated 2 years ago
- Utilities for loading and running text embeddings with ONNX ☆45 · Aug 16, 2025 · Updated 9 months ago
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆14 · Apr 28, 2023 · Updated 3 years ago
- ☆21 · Oct 6, 2023 · Updated 2 years ago
- Truly flash implementation of DeBERTa disentangled attention mechanism. ☆89 · Feb 10, 2026 · Updated 4 months ago
- Don't just regulate gradients like in Muon, regulate the weights too ☆32 · Jul 30, 2025 · Updated 10 months ago
- ☆20 · Nov 23, 2022 · Updated 3 years ago
- ☆15 · Oct 24, 2023 · Updated 2 years ago
- ☆11 · Nov 27, 2022 · Updated 3 years ago
- A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate. ☆40 · May 20, 2026 · Updated 3 weeks ago
- Notebooks for training universal 0-shot classifiers on many different tasks ☆141 · Dec 28, 2024 · Updated last year
- ☆20 · Apr 26, 2026 · Updated last month
- Mining Discourse Markers for Unsupervised Sentence Representation Learning ☆61 · May 31, 2023 · Updated 3 years ago
- 🥤🧑🏻🚀Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization…☆242Jan 23, 2026Updated 4 months ago
- Multi-task modelling extensions for huggingface transformers ☆21 · Mar 3, 2023 · Updated 3 years ago
- ☆15 · Apr 26, 2025 · Updated last year
- Tool to apply Legal Matter Specification Standard (LMSS) to documents ☆12 · Aug 15, 2024 · Updated last year
- Code to create bugged Python scripts for OpenAssistant Training, maintained by https://twitter.com/Cyndesama ☆24 · Jul 23, 2023 · Updated 2 years ago
- Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings ☆15 · May 3, 2023 · Updated 3 years ago
- A tiny BERT for low-resource monolingual models ☆32 · Dec 24, 2025 · Updated 5 months ago
- We are creating a challenging new benchmark MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models. Retrieval quest… ☆31 · Jul 9, 2020 · Updated 5 years ago
- ☆17 · Apr 10, 2024 · Updated 2 years ago
- Late Interaction Models Training & Retrieval ☆842 · Updated this week
- A project designed to extract relevant metadata from databases and transform it into context for Retrieval-Augmented Generation (RAG) in … ☆14 · Aug 6, 2025 · Updated 10 months ago
- StAtutory Reasoning Assessment ☆17 · Dec 8, 2022 · Updated 3 years ago
- Here we collect trick questions and failed tasks for open source LLMs to improve them. ☆32 · Apr 20, 2023 · Updated 3 years ago
- One stop shop for all things carp ☆58 · Sep 9, 2022 · Updated 3 years ago
- Open-Retrieval Conversational Machine Reading: A new setting & OR-ShARC dataset ☆13 · Nov 19, 2022 · Updated 3 years ago
- Next-generation Punkt sentence boundary detection with zero dependencies ☆31 · Nov 18, 2025 · Updated 6 months ago