Low memory full parameter finetuning of LLMs
☆54 · Jul 18, 2025 · Updated 10 months ago
Alternatives and similar repositories for lowmem_finetuning
Users that are interested in lowmem_finetuning are comparing it to the libraries listed below.
- Marketplace ML experiment - training without backprop ☆27 · Sep 9, 2025 · Updated 9 months ago
- Project code for training LLMs to write better unit tests + code ☆22 · May 19, 2025 · Updated last year
- Implementation of <Model Merging with Functional Dual Anchors> ☆47 · Nov 23, 2025 · Updated 6 months ago
- TPU support for the fastai library ☆14 · Apr 15, 2021 · Updated 5 years ago
- ExpertFingerprinting: Behavioral Pattern Analysis and Specialization Mapping of Experts in GPT-OSS-20B's Mixture-of-Experts Architecture ☆27 · Feb 3, 2026 · Updated 4 months ago
- Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton ☆48 · Apr 2, 2026 · Updated 2 months ago
- Fine-Tuning Llama3-8B LLM in a multi-GPU environment using DeepSpeed ☆21 · May 27, 2024 · Updated 2 years ago
- Convert MathML to LaTeX for OneNote to Markdown ☆13 · Mar 17, 2026 · Updated 2 months ago
- WiDS Datathon 2020 Second place solution ☆10 · Jul 6, 2023 · Updated 2 years ago
- Codebase from our first release. ☆58 · Feb 17, 2026 · Updated 3 months ago
- Lessons on basic Natural Language Processing concepts, offered on the open Discord server of Turing USP ☆10 · Jul 30, 2021 · Updated 4 years ago
- The elegant integration of huggingface/nlp and fastai2 and handy transforms using pure huggingface/nlp ☆19 · Oct 6, 2020 · Updated 5 years ago
- ☆46 · Mar 31, 2026 · Updated 2 months ago
- we have ai at home ☆112 · May 13, 2026 · Updated 3 weeks ago
- A Very Simple Demo of Fine Tuning Sentence Transformers ☆15 · Jun 15, 2023 · Updated 2 years ago
- Dynamic batching for Document Layout and OCR, suitable for RAG, with extra tools. ☆14 · Nov 25, 2024 · Updated last year
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Apr 10, 2026 · Updated last month
- ☆13 · May 30, 2019 · Updated 7 years ago
- Coding with ChatGPT and other LLMs, published by Packt ☆16 · Dec 9, 2024 · Updated last year
- Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs" ☆35 · Oct 12, 2025 · Updated 7 months ago
- An agent for CUDA compute-communication kernel co-design ☆35 · May 7, 2026 · Updated last month
- Adaptive Resonance Theory models ☆16 · May 12, 2017 · Updated 9 years ago
- ☆22 · Jan 29, 2026 · Updated 4 months ago
- This course is published by Packt Publishing ☆23 · Aug 2, 2023 · Updated 2 years ago
- Raw bindings to platform APIs for OCaml ☆16 · Mar 18, 2024 · Updated 2 years ago
- This repository contains code that was used to train and evaluate deep learning models, as described in the article "Improving breast can… ☆16 · Aug 13, 2022 · Updated 3 years ago
- Proposed plumbing commands for cargo ☆25 · Jun 1, 2026 · Updated last week
- ☆47 · Sep 15, 2025 · Updated 8 months ago
- Inference code for LLaMA models ☆21 · Apr 3, 2025 · Updated last year
- Leveraging ☆13 · Dec 7, 2023 · Updated 2 years ago
- Verifiers for LLM Reinforcement Learning ☆80 · Apr 15, 2025 · Updated last year
- Get insights from your research papers with LlamaExtract ☆29 · Aug 8, 2025 · Updated 10 months ago
- Scripts for training Qwen 2.5 VL with ms-swift and GRPO ☆12 · Feb 27, 2025 · Updated last year
- This project develops compact transformer models tailored for clinical text analysis, balancing efficiency and performance for healthcare… ☆18 · Mar 26, 2024 · Updated 2 years ago
- A Test Collection of Computer Science Papers for Faceted Query by Example ☆23 · Nov 28, 2021 · Updated 4 years ago
- Nexusflow function call, tool use, and agent benchmarks. ☆30 · Dec 13, 2024 · Updated last year
- Charlson Comorbidity Index Regression using Clinical Notes ☆10 · Jul 26, 2018 · Updated 7 years ago
- notebooks of cool EBM visualizations ☆15 · Feb 12, 2021 · Updated 5 years ago
- Training tiny models to prove hard theorems ☆77 · Mar 5, 2026 · Updated 3 months ago