stanford-cs336 / spring2024-assignment1-basics
☆30Updated 8 months ago
Alternatives and similar repositories for spring2024-assignment1-basics:
Users who are interested in spring2024-assignment1-basics are comparing it to the libraries listed below.
- ☆65Updated last month
- ☆48Updated last year
- Understanding how features learned by neural networks evolve throughout training☆33Updated 5 months ago
- Sparse and discrete interpretability tool for neural networks☆59Updated last year
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…☆71Updated 7 months ago
- Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?"☆33Updated 2 months ago
- gzip Predicts Data-dependent Scaling Laws☆34Updated 9 months ago
- A puzzle to learn about prompting☆124Updated last year
- Project 2 (Building Large Language Models) for Stanford CS324: Understanding and Developing Large Language Models (Winter 2022)☆102Updated 2 years ago
- PyTorch library for Active Fine-Tuning☆61Updated last month
- ☆20Updated 11 months ago
- A reading list of relevant papers and projects on foundation model annotation☆25Updated 3 weeks ago
- Collection of autoregressive model implementations☆83Updated last month
- Simple and efficient PyTorch-native transformer training and inference (batched)☆71Updated 11 months ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*☆81Updated last year
- Repository for the code and dataset for the paper: "Have LLMs Advanced enough? Towards Harder Problem Solving Benchmarks For Large Langu…☆39Updated last year
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding.☆37Updated last week
- ☆121Updated last year
- ReBase: Training Task Experts through Retrieval Based Distillation☆28Updated last month
- ☆26Updated last year
- Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models"☆41Updated last year
- ☆37Updated 11 months ago
- Discovering Data-driven Hypotheses in the Wild☆65Updated 4 months ago
- ☆51Updated 10 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆71Updated 4 months ago
- A place to store reusable transformer components of my own creation or found on the interwebs☆48Updated this week
- Extract full next-token probabilities via language model APIs☆237Updated last year
- Utilities for Training Very Large Models☆58Updated 6 months ago
- See https://github.com/cuda-mode/triton-index/ instead!☆11Updated 10 months ago
- Minimal (400 LOC) implementation, maximum (multi-node, FSDP) GPT training☆122Updated 11 months ago