Red-Hat-AI-Innovation-Team / SQuat
☆13 · Updated 3 weeks ago
Alternatives and similar repositories for SQuat
Users interested in SQuat are comparing it to the repositories listed below.
- An extension of the GaLore paper, performing Natural Gradient Descent in a low-rank subspace ☆16 · Updated 6 months ago
- ☆78 · Updated 8 months ago
- Unofficial implementation of Selective Attention Transformer ☆16 · Updated 6 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆29 · Updated last month
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆27 · Updated 7 months ago
- ☆17 · Updated 4 months ago
- ☆12 · Updated 4 months ago
- ☆24 · Updated last month
- Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More ☆31 · Updated this week
- Work in progress. ☆62 · Updated last month
- ☆37 · Updated 7 months ago
- ☆51 · Updated 6 months ago
- Activation-aware Singular Value Decomposition for Compressing Large Language Models ☆66 · Updated 6 months ago
- ☆49 · Updated last month
- [ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness ☆29 · Updated 2 weeks ago
- Official implementation of the ICML 2024 paper RoSA (Robust Adaptation) ☆41 · Updated last year
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression" ☆12 · Updated 5 months ago
- ☆31 · Updated 4 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆30 · Updated 2 months ago
- Official implementation of the ECCV24 paper POA ☆24 · Updated 9 months ago
- ☆18 · Updated this week
- Code for "Reasoning to Learn from Latent Thoughts" ☆94 · Updated last month
- Code for data-aware compression of DeepSeek models ☆24 · Updated last month
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆56 · Updated 2 months ago
- The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques" (TMLR) ☆67 · Updated last month
- A repository for research on medium-sized language models ☆76 · Updated 11 months ago
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ☆89 · Updated 11 months ago
- This repo contains the source code for "Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs" ☆37 · Updated 9 months ago
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆26 · Updated 6 months ago
- Tiny re-implementation of MDM in the style of LLaDA and the nano-gpt speedrun ☆50 · Updated 2 months ago