LAION-AI / Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
☆208Updated last year
Alternatives and similar repositories for Open-Instruction-Generalist:
Users who are interested in Open-Instruction-Generalist are comparing it to the libraries listed below.
- ☆179Updated 2 years ago
- All available datasets for Instruction Tuning of Large Language Models☆250Updated last year
- ☆106Updated last year
- Simple next-token-prediction for RLHF☆225Updated last year
- An experimental implementation of the retrieval-enhanced language model☆74Updated 2 years ago
- ☆97Updated last year
- Reverse Instructions to generate instruction tuning data with corpus examples☆209Updated last year
- Inference script for Meta's LLaMA models using Hugging Face wrapper☆110Updated 2 years ago
- Tk-Instruct is a Transformer model that is tuned to solve many NLP tasks by following instructions.☆180Updated 2 years ago
- DSIR large-scale data selection framework for language model training☆246Updated last year
- ☆159Updated 2 years ago
- Repository for analysis and experiments in the BigCode project.☆118Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs☆188Updated 8 months ago
- ☆173Updated last year
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al.☆162Updated last year
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models☆81Updated last year
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation☆219Updated last year
- The original implementation of Min et al. "Nonparametric Masked Language Modeling" (paper https://arxiv.org/abs/2212.01349)☆157Updated 2 years ago
- ☆270Updated 2 years ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks☆177Updated last year
- Inspired by google c4, here is a series of colossal clean data cleaning scripts focused on CommonCrawl data processing. Including Chinese…☆123Updated last year
- Scalable training for dense retrieval models.☆292Updated 2 months ago
- Unofficial implementation of AlpaGasus☆91Updated last year
- Data and code for "DocPrompting: Generating Code by Retrieving the Docs" @ICLR 2023☆243Updated last year
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning☆90Updated last year
- This project is an attempt to create a common metric to test LLMs for progress in eliminating hallucinations, which is the most serious c…☆222Updated 2 years ago
- ☆237Updated 2 years ago
- Code for the paper InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning☆100Updated last year
- Code for ACL2023 paper: Pre-Training to Learn in Context☆108Updated 9 months ago
- Open Source WizardCoder Dataset☆158Updated last year