Master the essential steps of pretraining large language models (LLMs). Learn to create high-quality datasets, configure model architectures, execute training runs, and assess model performance for efficient and effective LLM pretraining.
☆26 · Aug 7, 2024 · Updated last year
Alternatives and similar repositories for Pretraining-LLMs
Users who are interested in Pretraining-LLMs are comparing it to the repositories listed below.
- Scripts of LLM pre-training and fine-tuning (w/wo LoRA, DeepSpeed) ☆87 · Jan 30, 2024 · Updated 2 years ago
- Python implementation of the 15 puzzle game ☆10 · Dec 30, 2016 · Updated 9 years ago
- This is the pipeline of our new article "Enzyme Co-Scientist: Harnessing Large Language Models for Enzyme Kinetic Data Extraction from Li… ☆17 · May 23, 2025 · Updated 11 months ago
- ☆12 · Nov 12, 2025 · Updated 5 months ago
- ☆20 · Aug 14, 2025 · Updated 8 months ago
- Codes and data for KDD 2024 Research Track paper "ProCom: A Few-shot Targeted Community Detection Algorithm" ☆11 · Aug 15, 2024 · Updated last year
- 🤯 Thoughts Lab is a "toolkit mini program" built on uni-app, using a general-purpose layered architecture that supports multi-platform publishing and cross-framework migration 🤞 ☆16 · Dec 13, 2022 · Updated 3 years ago
- [🎖️ 1st place (Minister's Award) solution] 2022 National Institute of Korean Language AI Language Proficiency Evaluation (aspect-based sentiment analysis of shopping mall review data) ☆11 · Jun 6, 2023 · Updated 2 years ago
- Toonification of real face images using PyTorch, Stylegan2 and Image-to-Image translation ☆13 · Jun 14, 2022 · Updated 3 years ago
- Python wrapper around Yuta Mori's implementation of SA-IS suffix array construction. ☆12 · Oct 26, 2012 · Updated 13 years ago
- ☆40 · Jun 14, 2025 · Updated 10 months ago
- ☆16 · Oct 14, 2025 · Updated 6 months ago
- Cell-type Assignment and Module Extraction based on a heterogeneous graph neural network. ☆13 · Jan 27, 2024 · Updated 2 years ago
- Ollivier-Ricci Curvature for Hypergraphs: A Unified Framework (ICLR 2023) ☆19 · Jun 14, 2023 · Updated 2 years ago
- [Paper][AAAI2023] Analogical Inference Enhanced Knowledge Graph Embedding ☆13 · Jan 19, 2023 · Updated 3 years ago
- VT3D: a versatile Visualization Toolbox for 3D spatial transcriptomics atlas ☆15 · Nov 6, 2024 · Updated last year
- This repository mainly holds NGS differential expression analysis methods that the author has learned; specific experiments and study designs will not be stored here. ☆15 · May 19, 2020 · Updated 5 years ago
- Calculate allowed interactions in QED ☆10 · Nov 2, 2022 · Updated 3 years ago
- Code for "Whole brain alignment of spatial transcriptomics between humans and mice with BrainAlign" ☆15 · Oct 4, 2024 · Updated last year
- Jupyter notebooks for course Finetuning Large Language Models, taught by Sharon Zhou (Lamini) and Andrew Ng (DeepLearning.AI). ☆16 · Oct 21, 2023 · Updated 2 years ago
- Named-Entity-Recognition Workshop ☆16 · May 27, 2019 · Updated 6 years ago
- Training PyTorch Faster-RCNN on custom dataset ☆14 · Jun 2, 2021 · Updated 4 years ago
- My implementation of the 15 puzzle game using the Pygame module with Python! ☆11 · Oct 7, 2020 · Updated 5 years ago
- Data Science & Machine Learning Project applied to Healthcare ☆16 · Dec 1, 2021 · Updated 4 years ago
- Image Segmentation On Custom Dataset Using YOLOv8 ☆19 · Jan 12, 2023 · Updated 3 years ago
- Fine-Tuning Llama3-8B LLM in a multi-GPU environment using DeepSpeed ☆21 · May 27, 2024 · Updated last year
- [NPJ AI] A foundation model for individual genome modelling ☆25 · Apr 10, 2026 · Updated 3 weeks ago
- OMNI-P2x: A universal neural network potential for excited states ☆13 · Mar 19, 2026 · Updated last month
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆62 · Nov 4, 2024 · Updated last year
- Implementation of the WWW'23 paper "Toward Degree Bias in Embedding-Based Knowledge Graph Completion" ☆15 · Jun 17, 2023 · Updated 2 years ago
- OPUS-Rota4: A Gradient-Based Protein Side-Chain Modeling Framework Assisted by Deep Learning-Based Predictors ☆11 · Apr 14, 2022 · Updated 4 years ago
- Curvature Graph Network ☆19 · Jul 1, 2020 · Updated 5 years ago
- Program to plot a Ramachandran plot of all dihedral angles from a given PDB file. Background is empirically generated from the peptides … ☆13 · Feb 25, 2025 · Updated last year
- NanoGPT (124M) in 5 minutes ☆15 · Feb 14, 2025 · Updated last year
- An n-body simulation of our solar system, written in Python ☆10 · Dec 6, 2021 · Updated 4 years ago
- Minimal (truly) muP implementation, consistent with TP4 and TP5 papers notation ☆14 · Jan 2, 2026 · Updated 4 months ago
- Code for the paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers" with GPT-J implementation. ☆15 · Mar 22, 2023 · Updated 3 years ago
- The official baseline implementations for Chronocept