Stanford "Language Modeling from Scratch" CS336 Assignment 1 - a personal implementation of Stanford CS336 Assignment 1, for reference only
☆44Jun 15, 2025Updated 10 months ago
Alternatives and similar repositories for cs336-assignment1-basics
Users that are interested in cs336-assignment1-basics are comparing it to the libraries listed below.
- [IEEE TKDE] An LLM-based Recommender System with user&item Tokenizers and a generative retrieval paradigm.☆26Mar 11, 2026Updated last month
- An agent with multiple CUHKSZ campus systems connected.☆17Dec 12, 2024Updated last year
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated 2 years ago
- This repo contains the syllabus of the Hugging Face Deep Reinforcement Learning Course translated into Chinese.☆10Jan 16, 2024Updated 2 years ago
- Solution to kaggle competition OTTO – Multi-Objective Recommender System: https://www.kaggle.com/competitions/otto-recommender-system☆22Feb 2, 2023Updated 3 years ago
- Implementation based on pytorch for DIN recommendation algorithm☆21Jul 30, 2020Updated 5 years ago
- Automate dating apps with AI☆20Jan 18, 2024Updated 2 years ago
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆10Dec 30, 2024Updated last year
- A step-by-step Chinese-language guide to reproducing OpenVLA☆96Dec 10, 2025Updated 4 months ago
- Procedural data generators suite for synthetic pretraining and formal reasoning☆36Updated this week
- Pytorch routines for (Ker)nel (Mac)hines☆12Oct 10, 2025Updated 6 months ago
- Simple MoE - Day 17 of 365 Days of Repos☆18Jan 17, 2025Updated last year
- 🐲 LLVM-based Kaleidoscope language compiler ✨☆12Dec 16, 2022Updated 3 years ago
- A project designed to build and render a full Minecraft crafting tree.☆10Aug 10, 2021Updated 4 years ago
- Exploring the minimal architecture required for coherent English language generation.☆12Mar 5, 2025Updated last year
- 6,080-param transformer achieving 100% accuracy on 10-digit addition. Trained from scratch in 10 minutes.☆22Feb 19, 2026Updated last month
- ☆11Mar 8, 2024Updated 2 years ago
- ☆11Jun 20, 2023Updated 2 years ago
- A Jupyter-style custom node for executing Python code and plotting within ComfyUI workflows.☆37Mar 18, 2026Updated 3 weeks ago
- Code for Multi-Aspect Cross-modal Quantization for Generative Recommendation. (AAAI 2026 Oral)☆37Dec 9, 2025Updated 4 months ago
- WIP: Unofficial implementation of diffusion autoencoders, using PyTorch☆11Feb 15, 2023Updated 3 years ago
- Reproducing GPT on the TinyStories dataset☆19Jan 18, 2024Updated 2 years ago
- NeurIPS22 "RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection" and T-PAMI Extension☆20Feb 21, 2025Updated last year
- ☆14Jun 2, 2025Updated 10 months ago
- Tutorials for MATH 4432 Statistical Machine Learning, HKUST, Fall 2022☆11Sep 17, 2024Updated last year
- This is a repository for RM2021 Software tutorial☆11Nov 4, 2020Updated 5 years ago
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili…☆12Mar 7, 2025Updated last year
- Experiments for "A Closer Look at In-Context Learning under Distribution Shifts"☆19May 29, 2023Updated 2 years ago
- A template project to both illustrate and serve as an example for plugin creations on top of the manim.☆20Apr 30, 2021Updated 4 years ago
- Code and data release for CCS'2022 paper "Understanding IoT Security from a Market-Scale Perspective"☆12Apr 13, 2023Updated 3 years ago
- C++-Animation-(Standard-Template-Library)-Engine, or CASTLE for short, is a C++ plotting and animation engine created by BiliBili uploader …☆11Jan 17, 2021Updated 5 years ago
- Curse-of-memory phenomenon of RNNs in sequence modelling☆19May 8, 2025Updated 11 months ago
- ☆19Dec 12, 2023Updated 2 years ago
- Exercise solutions for Structure and Interpretation of Computer Programs (2nd edition); read online at https://relph1119.github.io/sicp-solutions-manual☆13May 28, 2021Updated 4 years ago
- Computing the greatest common divisor with transformers, source code for the paper https://arxiv.org/abs/2308.15594☆14Aug 11, 2025Updated 8 months ago
- My solution to assignments for Berkeley CS 285: Deep Reinforcement Learning, Decision Making, and Control.☆17Mar 19, 2025Updated last year
- [NeurIPS 2023] "Unleashing the Power of Graph Data Augmentation on Covariate Distribution Shift" by Yongduo Sui, Qitian Wu, Jiancan Wu, Q…☆17Nov 6, 2023Updated 2 years ago
- UCB 285 Deep Reinforcement Learning (Fall 2023) Homeworks☆13Nov 11, 2023Updated 2 years ago
- An AOP solution implemented with Javassist and ASM☆13May 14, 2020Updated 5 years ago