pp1230 / LLMGPUMemEstimator
The GPU RAM Estimator provides a simple tool for estimating GPU memory usage during training and inference.
☆26Updated 7 months ago
Related projects
Alternatives and complementary repositories for LLMGPUMemEstimator
- Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continued pre-training to improve …☆27Updated 3 months ago
- How to train an LLM tokenizer☆130Updated last year
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"☆109Updated 5 months ago
- Train LLMs (bloom, llama, baichuan2-7b, chatglm3-6b) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.☆90Updated 9 months ago
- Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"☆56Updated 2 months ago
- A prototype repo for hybrid training of pipeline parallel and distributed data parallel with comments on core code snippets. Feel free to…☆49Updated last year
- Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models☆53Updated 5 months ago
- Repository of LV-Eval Benchmark☆50Updated 2 months ago
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆52Updated 7 months ago
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆67Updated last week
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆69Updated last year
- Fine-tune large language models with the DPO algorithm; simple and easy to get started.☆28Updated 4 months ago
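- Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to understanding, discussing, and implementing the techniques, principles, and applications related to large models.☆269Updated 4 months ago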
- Code implementation of Baichuan Dynamic NTK-ALiBi: inference on longer texts without fine-tuning☆46Updated last year
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models☆38Updated 8 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆126Updated 2 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings☆146Updated 5 months ago
- 1.4B sLLM for Chinese and English - HammerLLM🔨☆43Updated 7 months ago
- PyTorch distributed training☆59Updated last year
- Notes on reproducing LLM-related work from scratch☆138Updated 2 months ago
- A collection of ChatGPT-related resources☆53Updated last year