tianlinxu312 / Everything-about-LLMs

A work in progress. Trying to write about all interesting or necessary pieces in the current development of LLMs and generative AI. Gradually adding more topics.

☆195

Alternatives and similar repositories for Everything-about-LLMs

Users who are interested in Everything-about-LLMs are comparing it to the repositories listed below.

rasbt / dora-from-scratch
LoRA and DoRA from Scratch Implementations
☆207 Updated last year
FranxYao / Long-Context-Data-Engineering
Implementation of paper Data Engineering for Scaling Language Models to 128K Context
☆467 Updated last year
catid / dora
Implementation of DoRA
☆301 Updated last year
SkyworkAI / Skywork-MoE
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
☆136 Updated last year
lucidrains / llama-qrlhf
Implementation of the Llama architecture with RLHF + Q-learning
☆166 Updated 6 months ago
LLM360 / amber-train
Pre-training code for Amber 7B LLM
☆167 Updated last year
Cohere-Labs-Community / parameter-efficient-moe
☆269 Updated last year
OpenNLPLab / TransnormerLLM
Official implementation of TransNormerLLM: A Faster and Better LLM
☆247 Updated last year
SumanthRH / tokenization
A comprehensive deep dive into the world of tokens
☆225 Updated last year
Strivin0311 / long-llms-learning
A repository sharing the literature on long-context large language models, including methodologies and evaluation benchmarks
☆265 Updated last year
llm-efficiency-challenge / neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
☆256 Updated last year
dwzhu-pku / PoSE
Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024)
☆205 Updated last year
YuchuanTian / RethinkTinyLM
[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”
☆122 Updated 6 months ago
GeneZC / MiniMA
Code for paper titled "Towards the Law of Capacity Gap in Distilling Language Models"
☆101 Updated last year
rasbt / cvpr2023
☆133 Updated last year
liutiedong / goat
A Fine-tuned LLaMA that is Good at Arithmetic Tasks
☆178 Updated last year
lucidrains / CALM-pytorch
Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google DeepMind
☆177 Updated 10 months ago
dingo-actual / infini-transformer
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention…
☆290 Updated last year
fangyuan-ksgk / Mini-LLaVA
A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.
☆94 Updated 7 months ago
huggingface / picotron_tutorial
☆206 Updated 5 months ago
pratyushasharma / laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
☆388 Updated last year
wolfecameron / nanoMoE
An extension of the nanoGPT repository for training small MoE models.
☆164 Updated 4 months ago
huggingface / datablations
Scaling Data-Constrained Language Models
☆338 Updated last month
timinar / BabyLlama
Training code for Baby-Llama, our submission to the strict-small track of the BabyLM challenge.
☆81 Updated last year
yuhuixu1993 / qa-lora
Official PyTorch implementation of QA-LoRA
☆138 Updated last year
sangmichaelxie / cs324_p2
Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)
☆105 Updated 2 years ago
TIGER-AI-Lab / MAmmoTH
Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" [ICLR 2024]
☆376 Updated 11 months ago
fangyuan-ksgk / Tiny-GRPO
Minimal GRPO implementation from scratch
☆94 Updated 4 months ago
yihedeng9 / rlhf-summary-notes
A brief and partial summary of RLHF algorithms.
☆131 Updated 5 months ago
vwxyzjn / summarize_from_feedback_details
☆147 Updated 8 months ago