Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need"
☆56 Jan 27, 2025 Updated last year
Alternatives and similar repositories for SGD_SaI
Users that are interested in SGD_SaI are comparing it to the libraries listed below.
- ☆35 Mar 12, 2025 Updated last year
- Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening ☆72 May 18, 2025 Updated 10 months ago
- Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model ☆13 Dec 29, 2024 Updated last year
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆29 Jul 24, 2025 Updated 8 months ago
- ☆11 Sep 20, 2024 Updated last year
- Switch EMA: A Free Lunch for Better Flatness and Sharpness ☆28 Feb 16, 2024 Updated 2 years ago
- ☆25 Dec 13, 2024 Updated last year
- The official implementation of HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization ☆18 Mar 7, 2025 Updated last year
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders ☆18 May 23, 2025 Updated 10 months ago
- Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM | EMNLP 2025 Findings ☆18 Oct 17, 2025 Updated 5 months ago
- Legacy LoRA Trainer that works on T4 GPU Colab for SDXL Model ☆23 Oct 18, 2025 Updated 5 months ago
- ☆19 Jan 3, 2025 Updated last year
- ☆15 Oct 4, 2024 Updated last year
- TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u… ☆25 Jun 4, 2025 Updated 10 months ago
- [ACM MM25] Official Pytorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding] ☆16 Jul 15, 2025 Updated 9 months ago
- A pytorch realization of adafactor (https://arxiv.org/pdf/1804.04235.pdf ) ☆26 Aug 27, 2019 Updated 6 years ago
- Adaptive Blind All-in-One Image Restoration ☆35 Mar 17, 2025 Updated last year
- Using short models to classify long texts ☆21 Mar 8, 2023 Updated 3 years ago
- [ICCV 2025] Official implementation of "What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?" ☆19 Aug 7, 2025 Updated 8 months ago
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion ☆14 Mar 17, 2025 Updated last year
- ☆15 Sep 22, 2024 Updated last year
- ☆15 Mar 2, 2025 Updated last year
- ☆19 Oct 14, 2024 Updated last year
- ☆33 Apr 22, 2025 Updated 11 months ago
- Fork of Flame repo for training of some new stuff in development ☆19 Updated this week
- [CVPR 2025 Highlight] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for C… ☆285 Jan 16, 2025 Updated last year
- ComfyUI Custom Node for "Golden Noise for Diffusion Models: A Learning Framework". This node refines the initial latent noise in the diff… ☆23 Mar 28, 2025 Updated last year
- TPDiff: Temporal Pyramid Video Diffusion Model ☆25 Mar 13, 2025 Updated last year
- [IJCAI 2024] Official implementation of the paper "Integrating View Conditions for Image Synthesis" ☆25 Aug 27, 2024 Updated last year
- ☆19 Jun 29, 2025 Updated 9 months ago
- The AdEMAMix Optimizer: Better, Faster, Older. ☆187 Sep 12, 2024 Updated last year
- [NeurIPS 2024] Exploring DCN-like Architectures for Fast Image Generation with Arbitrary Resolution ☆35 Dec 23, 2024 Updated last year
- ☆15 Apr 6, 2026 Updated last week
- ☆13 Apr 1, 2026 Updated 2 weeks ago
- A1111, ComfyUI, Forge, Forge-Classic/Neo, ReForge, SD-UX - One NoteBook for Google Colab & Kaggle ☆35 Apr 4, 2026 Updated last week
- Pretraining and finetuning for visual instruction following with Mixture of Experts ☆15 Jan 30, 2024 Updated 2 years ago
- Mimic Intent, Not Just Trajectories (MINT) official implementation ☆30 Apr 6, 2026 Updated last week
- MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following ☆16 Oct 31, 2024 Updated last year
- Triton Implementation of HyperAttention Algorithm ☆48 Dec 11, 2023 Updated 2 years ago