EasternJournalist / learn-deep-learning
Labs for deep learning course.
☆16Updated 4 years ago
Alternatives and similar repositories for learn-deep-learning
Users that are interested in learn-deep-learning are comparing it to the libraries listed below
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆58Updated last year
- Transformer model based on the Gated Attention Unit (preview version)☆98Updated 2 years ago
- RLHF experiments on a single A100 40G GPU. Support PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, DeepSeek R1-Zero reproducing.☆79Updated 11 months ago
- an implementation of transformer, bert, gpt, and diffusion models for learning purposes☆160Updated last year
- Official implementation for NAACL 2024 paper "HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Class…☆19Updated last year
- [ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)☆128Updated last year
- ☆45Updated 2 months ago
- Feeling confused about super alignment? Here is a reading list☆44Updated 2 years ago
- ☆42Updated 11 months ago
- Notes of my introduction about NLP in Fudan University☆37Updated 4 years ago
- Implementation of the DPO algorithm☆50Updated last year
- ☆109Updated 6 months ago
- Tips for paper writing and research: notes and summaries of experience in writing scientific and technical papers☆139Updated 4 years ago
- A repo showcasing the use of MCTS with LLMs to solve GSM8K problems☆94Updated 2 months ago
- Must-read papers on improving efficiency for pre-trained language models.☆105Updated 3 years ago
- SimCSE☆15Updated 3 years ago
- ☆125Updated last year
- A Tight-fisted Optimizer (Tiger), implemented in PyTorch.☆12Updated last year
- A pre-trained model with multi-exit transformer architecture.☆56Updated 3 years ago
- 《自然语言处理概论》 (Introduction to Natural Language Processing), by 张奇 (Qi Zhang), 桂韬 (Tao Gui), and 黄萱菁 (Xuanjing Huang)☆121Updated 2 years ago
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"…☆80Updated 2 years ago
- Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales☆32Updated 2 years ago
- Official completion of “Training on the Benchmark Is Not All You Need”.☆39Updated last year
- Implementation of "ACL'24: When Do LLMs Need Retrieval Augmentation? Mitigating LLMs’ Overconfidence Helps Retrieval Augmentation"☆24Updated last year
- [ICLR 2025] MiniPLM: Knowledge Distillation for Pre-Training Language Models☆73Updated last year
- Lion and Adam optimization comparison☆64Updated 2 years ago
- A summary of open-source large language models and low-cost replication methods for ChatGPT.☆136Updated 2 years ago
- Fantastic Data Engineering for Large Language Models☆93Updated last year
- ☆18Updated 3 years ago
- Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd…☆63Updated last year