A Tight-fisted Optimizer
☆50 · Mar 7, 2023 · Updated 2 years ago
Alternatives and similar repositories for tiger
Users who are interested in tiger are comparing it to the libraries listed below.
- A Tight-fisted Optimizer (Tiger), implemented in PyTorch. ☆12 · Jun 26, 2024 · Updated last year
- Lion and Adam optimization comparison ☆64 · Feb 23, 2023 · Updated 3 years ago
- Some RNN implementations ☆52 · Mar 29, 2023 · Updated 2 years ago
- An external memory allocator example for PyTorch. ☆16 · Aug 10, 2025 · Updated 6 months ago
- Transformer model based on the Gated Attention Unit (preview version) ☆98 · Feb 24, 2023 · Updated 3 years ago
- ☆11 · Jan 18, 2024 · Updated 2 years ago
- Depict GPU memory footprint during DNN training of PyTorch ☆11 · Nov 17, 2022 · Updated 3 years ago
- ☆25 · Jun 24, 2021 · Updated 4 years ago
- A Simple Adaptive Unfolding Network for Hyperspectral Image Reconstruction ☆32 · Feb 1, 2023 · Updated 3 years ago
- Rectified Rotary Position Embeddings ☆389 · May 20, 2024 · Updated last year
- The Bytepiece Tokenizer Implemented in Rust. ☆14 · Nov 28, 2023 · Updated 2 years ago
- PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24) ☆63 · Apr 18, 2024 · Updated last year
- Stable Diffusion V1.5 Inference With PyTorch Weights And More Features Like Stable Diffusion Web UI In Keras 3.x ☆16 · May 28, 2025 · Updated 9 months ago
- GAU-alpha-pytorch ☆20 · May 11, 2022 · Updated 3 years ago
- SOIT: Segmenting Objects with Instance-Aware Transformers ☆14 · Jun 6, 2022 · Updated 3 years ago
- ☆19 · May 27, 2023 · Updated 2 years ago
- [CVPR 2023] RILS: Masked Visual Reconstruction in Language Semantic Space (https://arxiv.org/abs/2301.06958) ☆44 · Sep 5, 2023 · Updated 2 years ago
- ☆17 · Nov 23, 2021 · Updated 4 years ago
- OneFlow Serving ☆21 · Apr 10, 2025 · Updated 10 months ago
- ☆20 · Nov 3, 2024 · Updated last year
- [NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding ☆22 · Oct 10, 2024 · Updated last year
- ☆40 · Feb 14, 2023 · Updated 3 years ago
- Adafactor optimizer for Keras ☆20 · Aug 19, 2021 · Updated 4 years ago
- ☆40 · Oct 24, 2023 · Updated 2 years ago
- Paper List for In-context Learning 🌷 ☆20 · Jan 3, 2023 · Updated 3 years ago
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and … ☆18 · Apr 12, 2024 · Updated last year
- ☆136 · May 29, 2025 · Updated 9 months ago
- ☆23 · Apr 25, 2023 · Updated 2 years ago
- Distributed DataLoader For PyTorch Based On Ray ☆25 · Nov 5, 2021 · Updated 4 years ago
- The official GitHub page for paper "NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional St… ☆25 · May 10, 2024 · Updated last year
- AziNorm: Exploiting the Radial Symmetry of Point Cloud for Azimuth-Normalized 3D Perception, CVPR 2022. ☆54 · Sep 15, 2022 · Updated 3 years ago
- Source code for GreaTer ICLR 2025 - Gradient Over Reasoning makes Smaller Language Models Strong Prompt Optimizers ☆35 · Apr 18, 2025 · Updated 10 months ago
- TVMScript kernel for deformable attention ☆25 · Dec 15, 2021 · Updated 4 years ago
- ☆31 · Mar 23, 2024 · Updated last year
- Teach-DETR: Better Training DETR with Teachers ☆31 · Mar 18, 2024 · Updated last year
- Examples for MS-AMP package. ☆30 · Jul 17, 2025 · Updated 7 months ago
- Accurately and reliably defining organs at risk (OARs) and tumors are the cornerstone of radiation therapy (RT) treatment planning for lu… ☆12 · Jul 19, 2023 · Updated 2 years ago
- ☆33 · Oct 9, 2022 · Updated 3 years ago
- Simple large-scale training of stable diffusion with multi-node support. ☆133 · May 8, 2023 · Updated 2 years ago