akoksal/LongForm

Reverse Instructions to generate instruction tuning data with corpus examples

☆215

Alternatives and similar repositories for LongForm

Users who are interested in LongForm are comparing it to the repositories listed below.

leonweber / pedl
Search the biomedical literature for protein interactions and protein associations
☆11 · Nov 24, 2023 · Updated 2 years ago
gururise / AlpacaDataCleaned
Alpaca dataset from Stanford, cleaned and curated
☆1,602 · Mar 7, 2026 · Updated 4 months ago
ahmetustun / hyperx
☆21 · Dec 5, 2022 · Updated 3 years ago
mobarski / alpaca-libre
Reimplementation of the task generation part from the Alpaca paper
☆118 · Apr 4, 2023 · Updated 3 years ago
PicoCreator / RWKV-LM-LoRA
RWKV is a RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best …
☆10 · Nov 3, 2023 · Updated 2 years ago
Instruction-Tuning-with-GPT-4 / GPT-4-LLM
Instruction Tuning with GPT-4
☆4,332 · Jun 11, 2023 · Updated 3 years ago
google-research / FLAN
☆1,565 · Jul 2, 2026 · Updated 3 weeks ago
malteos / clp-transfer
Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning
☆30 · Jan 25, 2023 · Updated 3 years ago
nlpyang / NoisySumm
Codes for NAACL 2021 paper 'Noisy Self-Knowledge Distillation for Text Summarization'
☆24 · Jul 27, 2021 · Updated 4 years ago
cwhy / rwkv-decon
Trying to deconstruct RWKV in understandable terms
☆14 · May 6, 2023 · Updated 3 years ago
lucidrains / CoLT5-attention
Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch
☆230 · Sep 6, 2024 · Updated last year
RWKV / RWKV-cpp-node
Node.js implementation binding for the RWKV.cpp module
☆22 · Aug 2, 2023 · Updated 2 years ago
r-three / RAD
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆45 · Oct 1, 2025 · Updated 9 months ago
nayohan / SentiCSE
[COLING 2024] SentiCSE: A Sentiment-aware Contrastive Sentence Embedding Framework with Sentiment-guided Textual Similarity
☆13 · May 8, 2024 · Updated 2 years ago
deep-diver / LLM-Serve
This repository provides a framework to serve LLM (Large Language Model) based applications such as chatbots.
☆18 · Apr 20, 2023 · Updated 3 years ago
mbzuai-nlp / LaMini-LM
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
☆822 · May 6, 2023 · Updated 3 years ago
yizhongw / self-instruct
Aligning pretrained language models with instruction data generated by themselves.
☆4,606 · Mar 27, 2023 · Updated 3 years ago
huggingface / datablations
Scaling Data-Constrained Language Models
☆344 · Jun 28, 2025 · Updated last year
declare-lab / instruct-eval
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
☆552 · Mar 10, 2024 · Updated 2 years ago
microsoft / KID
Knowledge Infused Decoding
☆70 · Dec 31, 2023 · Updated 2 years ago
TheDuckAI / arb
Advanced Reasoning Benchmark Dataset for LLMs
☆48 · Nov 19, 2023 · Updated 2 years ago
seonghyeonye / Flipped-Learning
[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
☆117 · Jun 28, 2025 · Updated last year
taylorai / galactic
Data cleaning and curation for unstructured text
☆329 · Aug 6, 2024 · Updated last year
Re-Align / URIAL
☆316 · Jun 9, 2024 · Updated 2 years ago
facebookresearch / bart_ls
Long-context pretrained encoder-decoder models
☆97 · Oct 28, 2022 · Updated 3 years ago
orhonovich / unnatural-instructions
☆181 · Feb 23, 2023 · Updated 3 years ago
ZeldaHuang / rwkv-cpp-server
Easily deploy your rwkv model
☆19 · May 5, 2023 · Updated 3 years ago
argilla-io / distilabel-spin-dibt
Repository containing the SPIN experiments on the DIBT 10k ranked prompts
☆24 · Mar 12, 2024 · Updated 2 years ago
LuoXiaoHeics / Continual-Tune
☆10 · Feb 6, 2025 · Updated last year
HanNight / RE-T5
Code and data for "Retrieval Enhanced Model for Commonsense Generation" (ACL-IJCNLP 2021).
☆29 · Dec 31, 2021 · Updated 4 years ago
wade3han / champagne
The official codebase for the paper "CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos" (ICCV 23)
☆52 · Aug 13, 2023 · Updated 2 years ago
jxjessieli / contextual-distortion-parser
[ACL 2023] Contextual Distortion Reveals Constituency: Mask Language Models are Implicit Parsers.
☆14 · Jun 3, 2023 · Updated 3 years ago
joeljang / ELM
[ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning
☆99 · Apr 26, 2023 · Updated 3 years ago
psunlpgroup / MACSum
Dataset, metrics, and models for TACL 2023 paper MACSUM: Controllable Summarization with Mixed Attributes.
☆34 · Jul 25, 2023 · Updated 3 years ago
kyleliang919 / Long-context-transformers
Exploring finetuning public checkpoints on filter 8K sequences on Pile
☆116 · Mar 22, 2023 · Updated 3 years ago
FranxYao / Long-Context-Data-Engineering
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
☆502 · Mar 19, 2024 · Updated 2 years ago
harrisonvanderbyl / godot-rwkv
RWKV godot interface module
☆61 · Jun 13, 2024 · Updated 2 years ago
jason9693 / ETA4LLMs
Calculating Expected Time for training LLM.
☆39 · Apr 17, 2023 · Updated 3 years ago
VikParuchuri / textbook_quality
Generate textbook-quality synthetic LLM pretraining data
☆508 · Oct 19, 2023 · Updated 2 years ago