Dahoas / gpt-neox-finetuning
☆15Updated 3 years ago
Alternatives and similar repositories for gpt-neox-finetuning
Users who are interested in gpt-neox-finetuning are comparing it to the libraries listed below.
- Tool to apply Legal Matter Specification Standard (LMSS) to documents☆12Updated last year
- Completion After Prompt Probability. Make your LLM make a choice☆82Updated last year
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum…☆51Updated 2 years ago
- Dataset collection and preprocessing framework for NLP extreme multitask learning☆189Updated 5 months ago
- 📚 Datasets and models for instruction-tuning☆238Updated 2 years ago
- ☆172Updated 10 months ago
- Efficient few-shot learning with cross-encoders.☆60Updated last year
- Official implementation of the paper "CoEdIT: Text Editing by Task-Specific Instruction Tuning" (EMNLP 2023)☆134Updated last year
- ☆17Updated last year
- Experiments with generating opensource language model assistants☆97Updated 2 years ago
- Tools for managing datasets for governance and training.☆87Updated last week
- A dataset featuring diverse dialogues between two ChatGPT (gpt-3.5-turbo) instances with system messages written by GPT-4. Covering vario…☆164Updated 2 years ago
- Repo for the Belebele dataset, a massively multilingual reading comprehension dataset.☆338Updated last year
- simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models.☆400Updated 2 years ago
- Reverse Instructions to generate instruction tuning data with corpus examples☆216Updated last year
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.☆64Updated last year
- 80x faster and 95% accurate language identification with Fasttext☆163Updated last year
- Developing tools to automatically analyze datasets☆75Updated last year
- [WIP] A 🔥 interface for running code in the cloud☆86Updated 2 years ago
- Exploring finetuning public checkpoints on filtered 8K sequences on Pile☆116Updated 2 years ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper.☆12Updated 2 years ago
- Using open source LLMs to build synthetic datasets for direct preference optimization☆71Updated last year
- A library for squeakily cleaning and filtering language datasets.☆49Updated 2 years ago
- Tune MPTs☆84Updated 2 years ago
- PRODIGy is a collection of dialogues in which each conversation is aligned with speaker profile representations.☆19Updated 11 months ago
- Simply, faster, sentence-transformers☆143Updated last year
- A Python Search Engine for Humans 🥸☆241Updated last year
- Seed Machine Translation Data☆33Updated last year
- 🤝 Trade any tensors over the network☆30Updated 2 years ago
- Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations.☆49Updated 3 years ago