Dahoas / gpt-neox-finetuning
☆16Updated 3 years ago
Alternatives and similar repositories for gpt-neox-finetuning:
Users who are interested in gpt-neox-finetuning are comparing it to the libraries listed below
- Tools for managing datasets for governance and training.☆85Updated 2 months ago
- Tool to apply Legal Matter Specification Standard (LMSS) to documents☆13Updated 8 months ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper.☆12Updated 2 years ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆76Updated 6 months ago
- Using short models to classify long texts☆21Updated 2 years ago
- A library for squeakily cleaning and filtering language datasets.☆47Updated last year
- Dataset collection and preprocessing framework for NLP extreme multitask learning☆179Updated 3 months ago
- [Added T5 support to TRLX] A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)☆47Updated 2 years ago
- Completion After Prompt Probability. Make your LLM make a choice☆76Updated 5 months ago
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum…☆46Updated 2 years ago
- Using open source LLMs to build synthetic datasets for direct preference optimization☆61Updated last year
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆57Updated 8 months ago
- QLoRA with Enhanced Multi GPU Support☆37Updated last year
- Experiments with generating open-source language model assistants☆97Updated last year
- Reimplementation of the task generation part from the Alpaca paper☆119Updated 2 years ago
- minimal pytorch implementation of bm25 (with sparse tensors)☆100Updated last year
- Efficient few-shot learning with cross-encoders.☆51Updated last year
- 🤗 Disaggregators: Curated data labelers for in-depth analysis.☆65Updated 2 years ago
- Source code for the paper "Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints"☆28Updated 2 years ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy.☆79Updated last year
- A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs☆114Updated 2 years ago
- Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations.☆49Updated 2 years ago
- Writing Blog Posts with Generative Feedback Loops!☆47Updated last year
- Finetune mistral-7b-instruct for sentence embeddings☆81Updated 11 months ago