stanford-cs336 / spring2024-lectures
☆413 · Dec 26, 2024 · Updated last year
Alternatives and similar repositories for spring2024-lectures
Users who are interested in spring2024-lectures are comparing it to the repositories listed below
- ☆18 · May 3, 2024 · Updated last year
- ☆73 · Jul 13, 2024 · Updated last year
- ☆22 · Apr 22, 2024 · Updated last year
- Image Tokenizer Needs Post-Training ☆24 · Oct 4, 2025 · Updated 4 months ago
- ☆2,583 · Jan 9, 2026 · Updated last month
- Vision Large Language Models trained on M3IT instruction tuning dataset ☆17 · Aug 16, 2023 · Updated 2 years ago
- Open-source framework for the research and development of foundation models. ☆758 · Updated this week
- ☆291 · Jul 15, 2024 · Updated last year
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆362 · Updated this week
- Minimalistic large language model 3D-parallelism training ☆2,544 · Dec 11, 2025 · Updated 2 months ago
- ☆15 · Jan 21, 2026 · Updated 3 weeks ago
- ☆26 · Sep 22, 2025 · Updated 4 months ago
- [CVPR'25 - Rating 555] Official PyTorch implementation of Lumos: Learning Visual Generative Priors without Text ☆53 · Mar 16, 2025 · Updated 11 months ago
- ☆10 · Jul 8, 2025 · Updated 7 months ago
- ☆13 · May 9, 2024 · Updated last year
- ☆13 · Sep 2, 2023 · Updated 2 years ago
- Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022 ☆11 · Aug 20, 2022 · Updated 3 years ago
- Typed python equivalent for R pipes. ☆13 · Oct 16, 2022 · Updated 3 years ago
- Mixture of Expert (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations. ☆12 · Feb 11, 2024 · Updated 2 years ago
- ☆12 · Aug 26, 2025 · Updated 5 months ago
- Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation ☆27 · Dec 12, 2025 · Updated 2 months ago
- ☆12 · Jul 6, 2023 · Updated 2 years ago
- Code for the paper "Interpreting and Improving Diffusion Models from an Optimization Perspective", appearing in ICML 2024 ☆14 · Sep 30, 2024 · Updated last year
- Physics of Language Models: Part 4.2, Canon Layers at Scale where Synthetic Pretraining Resonates in Reality ☆317 · Jan 5, 2026 · Updated last month
- 🚀 Efficient implementations of state-of-the-art linear attention models ☆4,379 · Updated this week
- Fast and memory-efficient exact attention ☆22,231 · Updated this week
- Group Meeting Record for Baobao Chang Group in Peking University ☆26 · May 17, 2021 · Updated 4 years ago
- ☆34 · Jun 21, 2023 · Updated 2 years ago
- The code for our NeurIPS 2021 paper "Kernelized Heterogeneous Risk Minimization". ☆13 · Oct 13, 2021 · Updated 4 years ago
- ☆14 · May 4, 2025 · Updated 9 months ago
- ☆16 · Apr 9, 2025 · Updated 10 months ago
- MISO: Learning Multiple Initial Solutions to Optimization Problems ☆16 · Nov 8, 2024 · Updated last year
- The supplementary material for the paper "Fine-tuning Large Language Models to Improve Accuracy and Comprehensibility of Automated Code R… ☆16 · Aug 12, 2024 · Updated last year
- Official codebase for Adaptive Online Planning for Continual Lifelong Learning. ☆16 · Mar 26, 2020 · Updated 5 years ago
- My learning notes for ML SYS. ☆5,306 · Jan 30, 2026 · Updated 2 weeks ago
- GPU programming related news and material links ☆1,967 · Sep 17, 2025 · Updated 4 months ago
- What would you do with 1000 H100s... ☆1,153 · Jan 10, 2024 · Updated 2 years ago
- [ICDCS 2023] DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining ☆12 · Dec 4, 2023 · Updated 2 years ago
- Extending context length of visual language models ☆12 · Dec 18, 2024 · Updated last year