dsdanielpark / open-llm-datasets

Repository for organizing datasets and papers used in Open LLM.

☆99

Alternatives and similar repositories for open-llm-datasets

Users who are interested in open-llm-datasets are comparing it to the repositories listed below.

nlpxucan / evol-instruct
☆270 Updated 2 years ago
LudwigStumpp / llm-leaderboard
A joint community effort to create one central leaderboard for LLMs.
☆303 Updated 10 months ago
LLM360 / amber-data-prep
Data preparation code for Amber 7B LLM
☆91 Updated last year
huggingface / llm_training_handbook
An open collection of methodologies to help with successful training of large language models.
☆502 Updated last year
dsdanielpark / open-llm-leaderboard-report
Weekly visualization report of Open LLM model performance based on 4 metrics.
☆87 Updated last year
swj0419 / detect-pretrain-code-contamination
☆76 Updated last year
kaistAI / SelFee
Official codebase for "SelFee: Iterative Self-Revising LLM Empowered by Self-Feedback Generation"
☆227 Updated 2 years ago
taprosoft / llm_finetuning
Convenient wrapper for fine-tuning and inference of Large Language Models (LLMs) with several quantization techniques (GPTQ, bitsandbytes…
☆146 Updated last year
jshuadvd / LongRoPE
Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens"
☆147 Updated 11 months ago
Gentopia-AI / Gentopia
Build Hierarchical Autonomous Agents through Config. Collaborative Growth of Specialized Agents.
☆319 Updated last year
OpenLemur / Lemur
[ICLR 2024] Lemur: Open Foundation Models for Language Agents
☆551 Updated last year
h2oai / h2o-wizardlm
Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning
☆311 Updated 8 months ago
imagination-research / sot
[ICLR 2024] Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation
☆171 Updated last year
salesforce / DialogStudio
DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection and Instruction-Aware Models for Conversational AI
☆507 Updated 5 months ago
Gryphe / BlockMerge_Gradient
Merge Transformers language models by use of gradient parameters.
☆206 Updated 11 months ago
DachengLi1 / LongChat
Official repository for LongChat and LongEval
☆523 Updated last year
Digitous / LLM-SLERP-Merge
Spherical Merge PyTorch/HF format Language Models with minimal feature loss.
☆132 Updated last year
jondurbin / bagel
A bagel, with everything.
☆322 Updated last year
daniel-furman / sft-demos
Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.
☆77 Updated 8 months ago
declare-lab / instruct-eval
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
☆546 Updated last year
aymeric-roucher / agent_reasoning_benchmark
🔧 Compare how Agent systems perform on several benchmarks. 📊🚀
☆98 Updated 8 months ago
metame-ai / awesome-llm-plaza
awesome llm plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodal, etc.
☆204 Updated this week
wang-research-lab / agentinstruct
Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"
☆113 Updated 10 months ago
LLM360 / amber-train
Pre-training code for Amber 7B LLM
☆166 Updated last year
mzbac / llama2-fine-tune
Scripts for fine-tuning Llama2 via SFT and DPO.
☆200 Updated last year
catid / self-discover
Implementation of Google's SELF-DISCOVER
☆296 Updated 11 months ago
nexusflowai / NexusRaven
NexusRaven-13B, a new SOTA Open-Source LLM for function calling. This repo contains everything for reproducing our evaluation on NexusRav…
☆317 Updated last year
JinjieNi / MixEval
The official evaluation suite and dynamic data release for MixEval.
☆242 Updated 8 months ago
FranxYao / Long-Context-Data-Engineering
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"
☆464 Updated last year
Re-Align / URIAL
☆310 Updated last year