sangmichaelxie / cs324_p2
Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)
☆104 · Updated 2 years ago
Alternatives and similar repositories for cs324_p2:
Users interested in cs324_p2 are comparing it to the libraries listed below.
- Functional local implementations of main model parallelism approaches ☆95 · Updated 2 years ago
- A puzzle to learn about prompting ☆127 · Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆255 · Updated last year
- Website for hosting the Open Foundation Models Cheat Sheet. ☆267 · Updated last week
- RuLES: a benchmark for evaluating rule-following in language models ☆220 · Updated last month
- Code repository for the c-BTM paper ☆106 · Updated last year
- Evaluating LLMs with fewer examples ☆150 · Updated last year
- ☆264 · Updated 2 months ago
- An interactive exploration of Transformer programming. ☆262 · Updated last year
- [NeurIPS 2023] Learning Transformer Programs ☆159 · Updated 10 months ago
- Supercharge huggingface transformers with model parallelism. ☆76 · Updated 6 months ago
- ☆166 · Updated last year
- ☆128 · Updated 2 weeks ago
- Scaling Data-Constrained Language Models ☆335 · Updated 6 months ago
- ☆92 · Updated last year
- Fast bare-bones BPE for modern tokenizer training ☆152 · Updated 2 weeks ago
- A comprehensive deep dive into the world of tokens ☆221 · Updated 9 months ago
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆59 · Updated 2 months ago
- Extract full next-token probabilities via language model APIs ☆240 · Updated last year
- Puzzles for exploring transformers ☆342 · Updated last year
- ☆84 · Updated 6 months ago
- ☆150 · Updated last year
- Inference code for LLaMA models in JAX ☆117 · Updated 10 months ago
- ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp… ☆218 · Updated last year
- Minimal PyTorch implementation of BM25 (with sparse tensors) ☆100 · Updated last year (a minimal sketch of the technique follows this list)
- An extension of the nanoGPT repository for training small MoE models. ☆123 · Updated last month
- ☆87 · Updated last year
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆81 · Updated last year
- Large language models (LLMs) made easy, EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆72 · Updated 8 months ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆126 · Updated last year
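The BM25 entry above describes ranking with sparse term-frequency tensors. The sketch below is a minimal, self-contained illustration of that scoring formula in PyTorch; it is not code from the linked repository, and the function names, toy corpus, and default parameters (k1=1.5, b=0.75) are assumptions for illustration only.

```python
# Illustrative BM25 scoring with a sparse term-frequency matrix (hypothetical sketch,
# not taken from the repository listed above). Documents are assumed to be
# pre-tokenized into integer term ids.
import torch

def build_tf_matrix(docs, vocab_size):
    """Build a sparse (num_docs x vocab_size) term-frequency matrix."""
    rows, cols, vals = [], [], []
    for i, doc in enumerate(docs):
        counts = {}
        for t in doc:
            counts[t] = counts.get(t, 0) + 1
        for t, c in counts.items():
            rows.append(i); cols.append(t); vals.append(float(c))
    indices = torch.tensor([rows, cols])
    return torch.sparse_coo_tensor(indices, torch.tensor(vals),
                                   (len(docs), vocab_size)).coalesce()

def bm25_scores(tf, query_terms, k1=1.5, b=0.75):
    """Score every document against a bag of query term ids using BM25."""
    n_docs = tf.shape[0]
    dense = tf.to_dense()                      # acceptable for a toy corpus
    doc_len = dense.sum(dim=1)                 # document lengths
    avgdl = doc_len.mean()                     # average document length
    df = (dense > 0).sum(dim=0).float()        # document frequency per term
    idf = torch.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    scores = torch.zeros(n_docs)
    for t in query_terms:
        f = dense[:, t]                        # term frequency of t in each document
        scores += idf[t] * f * (k1 + 1) / (f + k1 * (1 - b + b * doc_len / avgdl))
    return scores

# Toy usage: three documents over a vocabulary of 5 term ids, query of two terms.
docs = [[0, 1, 1, 2], [2, 3, 3], [0, 2, 4, 4, 4]]
tf = build_tf_matrix(docs, vocab_size=5)
print(bm25_scores(tf, query_terms=[1, 2]))
```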