Official implementation of the paper: "ZClip: Adaptive Spike Mitigation for LLM Pre-Training".
☆145Nov 20, 2025Updated 3 months ago
Alternatives and similar repositories for ZClip
Users who are interested in ZClip compare it to the libraries listed below
- open source alpha evolve☆68May 19, 2025Updated 9 months ago
- Recursive Least Squares (RLS) with Neural Network for fast learning☆58Nov 16, 2023Updated 2 years ago
- XmodelLM☆38Nov 19, 2024Updated last year
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning.☆130Nov 18, 2025Updated 3 months ago
- Fine-tune your Florence-2 model easily☆21Jul 27, 2024Updated last year
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"☆45Updated this week
- Benchmarking Optimizers for LLM Pretraining☆54Dec 30, 2025Updated 2 months ago
- ☆24Oct 21, 2024Updated last year
- ☆11Oct 2, 2024Updated last year
- Python scripts for AI voice changers☆14Apr 25, 2023Updated 2 years ago
- Custom nodes for personal use☆15Jun 16, 2025Updated 8 months ago
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025).☆12Sep 22, 2025Updated 5 months ago
- ☆13Jan 15, 2025Updated last year
- Single-step image generation at 306 FPS. Drifting vs Diffusion head-to-head on CIFAR-10.☆38Feb 13, 2026Updated 3 weeks ago
- Extract phoneme-level timestamps from speech audio.☆119Feb 28, 2026Updated last week
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam☆86Jul 28, 2024Updated last year
- [ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models☆39Jun 14, 2025Updated 8 months ago
- NextPGP is an elegant, powerful, modern online PGP tool built with Next.js. It can generate keys, manage keyrings, encrypt and decrypt …☆27Feb 7, 2026Updated last month
- Control LLM☆22Apr 6, 2025Updated 11 months ago
- A scalable implementation of diffusion and flow-matching with XGBoost models, applied to calorimeter data.☆19Nov 3, 2024Updated last year
- ☆19Jan 11, 2024Updated 2 years ago
- Official implementation of Categorical Flow Maps on text.☆46Feb 16, 2026Updated 3 weeks ago
- [CVPR 2026] DDT: Decoupled Diffusion Transformer☆367Aug 22, 2025Updated 6 months ago
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆21Feb 7, 2023Updated 3 years ago
- ☆18Aug 21, 2025Updated 6 months ago
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"☆20Apr 9, 2025Updated 11 months ago
- Flash Attention Triton kernel with support for second-order derivatives☆146Feb 24, 2026Updated 2 weeks ago
- Custom ComfyUI nodes for downloading, converting, and previewing audio/video from YouTube and 1,000+ other platforms☆30Sep 6, 2025Updated 6 months ago
- To be a next-generation DL-based phenotype prediction from genome mutations.☆18May 17, 2021Updated 4 years ago
- ☆19Oct 2, 2023Updated 2 years ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆26Aug 9, 2025Updated 7 months ago
- ☆34Jun 5, 2025Updated 9 months ago
- CNN ensemble for prostate cancer Gleason grading☆19Jan 28, 2026Updated last month
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.☆595Aug 12, 2025Updated 6 months ago
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning☆57Oct 10, 2025Updated 5 months ago
- [NeurIPS 25] The official implementation of SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning☆25Sep 21, 2025Updated 5 months ago
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025)☆26Feb 25, 2025Updated last year
- ☆585Sep 23, 2025Updated 5 months ago
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?☆83Jan 22, 2025Updated last year