sangmichaelxie / cs324_p2
Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)
☆104 Updated 2 years ago
Alternatives and similar repositories for cs324_p2:
Users who are interested in cs324_p2 are comparing it to the repositories listed below:
- A puzzle to learn about prompting ☆127 Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆255 Updated last year
- ☆150 Updated last year
- ☆267 Updated 3 months ago
- Extract full next-token probabilities via language model APIs ☆242 Updated last year
- An interactive exploration of Transformer programming. ☆263 Updated last year
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl… ☆72 Updated 8 months ago
- ☆49 Updated last year
- Functional local implementations of main model parallelism approaches ☆95 Updated 2 years ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆69 Updated 2 years ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆114 Updated 4 months ago
- [NeurIPS 2023] Learning Transformer Programs ☆161 Updated 11 months ago
- Evaluating LLMs with CommonGen-Lite ☆90 Updated last year
- Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference… ☆207 Updated 3 months ago
- Code repository for the c-BTM paper ☆106 Updated last year
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆150 Updated last week
- Code associated to papers on superposition (in ML interpretability) ☆27 Updated 2 years ago
- I learn about and explain quantization ☆26 Updated last year
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆60 Updated 3 months ago
- Scaling Data-Constrained Language Models ☆334 Updated 7 months ago
- Simple and efficient pytorch-native transformer training and inference (batched) ☆75 Updated last year
- ☆36 Updated 2 years ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆172 Updated 2 months ago
- ☆166 Updated last year
- ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp… ☆220 Updated last year
- ☆159 Updated 2 years ago
- ☆130 Updated last month
- Inference code for LLaMA models in JAX ☆118 Updated 11 months ago
- Tune efficiently any LLM model from HuggingFace using distributed training (multiple GPU) and DeepSpeed. Uses Ray AIR to orchestrate the … ☆57 Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆114 Updated this week