Stanford "Language Modeling from Scratch" CS336 Assignment 1: a personal implementation of Stanford University's CS336 course assignment 1, for reference only
☆40Jun 15, 2025Updated 8 months ago
Alternatives and similar repositories for cs336-assignment1-basics
Users that are interested in cs336-assignment1-basics are comparing it to the libraries listed below
- Automate dating apps with AI☆19Jan 18, 2024Updated 2 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated last year
- Exploring the minimal architecture required for coherent English language generation.☆12Mar 5, 2025Updated last year
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes.☆21Feb 19, 2026Updated 2 weeks ago
- Tutorials for MATH 4432 Statistical Machine Learning, HKUST, Fall 2022☆11Sep 17, 2024Updated last year
- Simple MoE - Day 17 of 365 Days of Repos☆17Jan 17, 2025Updated last year
- PyTorch routines for (Ker)nel (Mac)hines☆10Oct 10, 2025Updated 4 months ago
- A project designed to build and render a full Minecraft crafting tree.☆10Aug 10, 2021Updated 4 years ago
- Implementation codes for NeurIPS23 paper "Spectral Invariant Learning for Dynamic Graphs under Distribution Shifts"☆13Mar 19, 2024Updated last year
- A Jupyter-style custom node for executing Python code and plotting within ComfyUI workflows.☆35Dec 16, 2025Updated 2 months ago
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili…☆12Mar 7, 2025Updated 11 months ago
- C++-Animation-(Standard-Template-Library)-Engine, or CASTLE for short, is a C++ plotting and animation engine created by BiliBili uploader …☆11Jan 17, 2021Updated 5 years ago
- ☆11Jun 20, 2023Updated 2 years ago
- Diffusion-based Negative Sampling on Graphs for Link Prediction☆13Feb 13, 2024Updated 2 years ago
- WIP: Unofficial implementation of diffusion autoencoders, using PyTorch☆11Feb 15, 2023Updated 3 years ago
- ☆29Nov 30, 2025Updated 3 months ago
- [NeurIPS2024] CURE4Rec: A Benchmark for Recommendation Unlearning with Deeper Influence☆20Jun 14, 2024Updated last year
- Code for ICLR 2023 Harnessing Out-Of-Distribution Examples via Augmenting Content and Style☆13Jul 3, 2023Updated 2 years ago
- An RL env with procedurally generated symbolic reasoning data☆34Updated this week
- Notebooks for the videos on my Bilibili channel (https://space.bilibili.com/32773300?spm_id_from=333.1007.0.0)☆30Nov 6, 2025Updated 3 months ago
- Code for Multi-Aspect Cross-modal Quantization for Generative Recommendation. (AAAI 2026 Oral)☆30Dec 9, 2025Updated 2 months ago
- An LLM-based Recommender System with user&item Tokenizers and a generative retrieval paradigm.☆24Aug 28, 2025Updated 6 months ago
- Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch☆1,295Aug 29, 2025Updated 6 months ago
- My solution to assignments for Berkeley CS 285: Deep Reinforcement Learning, Decision Making, and Control.☆16Mar 19, 2025Updated 11 months ago
- ☆23Apr 16, 2024Updated last year
- Solutions to the exercises in "Structure and Interpretation of Computer Programs" (2nd edition); read online at https://relph1119.github.io/sicp-solutions-manual☆13May 28, 2021Updated 4 years ago
- Reproducing GPT on the TinyStories dataset☆19Jan 18, 2024Updated 2 years ago
- Computing the greatest common divisor with transformers, source code for the paper https://arxiv.org/abs/2308.15594☆14Aug 11, 2025Updated 6 months ago
- Curse-of-memory phenomenon of RNNs in sequence modelling☆19May 8, 2025Updated 9 months ago
- The official codes of Rethinking Knowledge Graph Evaluation Under the Open-World Assumption (NeurIPS 2022)☆22Sep 20, 2022Updated 3 years ago
- NeurIPS22 "RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection" and T-PAMI Extension☆20Feb 21, 2025Updated last year
- ☆17Feb 4, 2025Updated last year
- UCB 285 Deep Reinforcement Learning (Fall 2023) Homeworks☆13Nov 11, 2023Updated 2 years ago
- source code of AAAI 2024 paper "Graph Invariant Learning with Subgraph Co-mixup for Out-Of-Distribution Generalization".☆18Apr 29, 2024Updated last year
- Complete self-learning materials of CS106L☆88Mar 31, 2024Updated last year
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts"☆19May 29, 2023Updated 2 years ago
- Datawhale open-source tutorial explaining the theory and methods of Bishop's deep learning book☆35Jan 8, 2026Updated last month
- ☆25Sep 7, 2025Updated 5 months ago