Infini-AI-Lab/Sirius

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Infini-AI-Lab/Sirius)

Infini-AI-Lab / Sirius

Sirius, an efficient correction mechanism, which significantly boosts Contextual Sparsity models on reasoning tasks while maintaining its efficiency gain.

☆21

Alternatives and similar repositories for Sirius

Users that are interested in Sirius are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

abdelfattah-lab / shadow_llm
View on GitHub
☆11Sep 20, 2024Updated last year
WinnieHAN / structure_adv
View on GitHub
☆10Oct 28, 2020Updated 5 years ago
DS3Lab / Decentralized_FM_alpha
View on GitHub
☆18May 4, 2023Updated 3 years ago
facebookresearch / MemoryMosaics
View on GitHub
Memory Mosaics are networks of associative memories working in concert to achieve a prediction task.
☆63Jan 30, 2025Updated last year
Infini-AI-Lab / Sparrow
View on GitHub
☆16Jun 15, 2026Updated last month
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
WinnieHAN / mndmv
View on GitHub
☆12Mar 4, 2022Updated 4 years ago
ranggihwang / Pregated_MoE
View on GitHub
☆62May 4, 2024Updated 2 years ago
kssteven418 / BigLittleDecoder
View on GitHub
[NeurIPS'23] Speculative Decoding with Big Little Decoder
☆99Feb 6, 2024Updated 2 years ago
WentseChen / Soft-QMIX
View on GitHub
Soft-QMIX: Integrating Maximum Entropy For Monotonic Value Function Factorization
☆15Jul 3, 2024Updated 2 years ago
SqueezeAILab / open_source_projects
View on GitHub
Open Source Projects from Pallas Lab
☆21Oct 10, 2021Updated 4 years ago
RobertCsordas / switchhead
View on GitHub
☆16Jun 11, 2025Updated last year
zilongzheng / CoopNets
View on GitHub
Cooperative Learning of Energy-Based Model and Latent Variable Model via MCMC Teaching
☆27Aug 4, 2018Updated 7 years ago
NonvolatileMemory / flash_tree_attn
View on GitHub
☆20Dec 24, 2024Updated last year
vedantroy / gpu_kernels
View on GitHub
☆27Jan 8, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
Infini-AI-Lab / gsm_infinite
View on GitHub
☆65Jun 12, 2025Updated last year
Infini-AI-Lab / MagicDec
View on GitHub
[ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding
☆154Dec 4, 2024Updated last year
Infini-AI-Lab / vortex_torch
View on GitHub
Vortex: Programmable Sparse Attention for Agents as Algorithm Designers
☆68Jun 24, 2026Updated 3 weeks ago
Infini-AI-Lab / TriForce
View on GitHub
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
☆281Aug 31, 2024Updated last year
LiangrunFlora / Slow-Fast-Sampling
View on GitHub
Official PyTorch implementation of the paper "Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Princ…
☆43Jul 18, 2025Updated last year
JackieTseng / DeepLearning-Application
View on GitHub
Training project about Deep Learing
☆12Jun 22, 2017Updated 9 years ago
PKU-SEC-Lab / AdapMoE
View on GitHub
Code release for AdapMoE accepted by ICCAD 2024
☆39Apr 28, 2025Updated last year
shoaibahmed / llm_depth_pruning
View on GitHub
Official implementation of the paper: "A deeper look at depth pruning of LLMs"
☆15Jul 24, 2024Updated last year
nanowell / Q-Sparse-LLM
View on GitHub
My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
☆37Aug 14, 2024Updated last year
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
Kaffaljidhmah2 / SpecDec_pp
View on GitHub
Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths
☆19Jul 10, 2025Updated last year
FMInference / DejaVu
View on GitHub
☆359Apr 2, 2024Updated 2 years ago
kssteven418 / SqueezeLLM-gradients
View on GitHub
☆21Feb 5, 2024Updated 2 years ago
jianwen-xie / 3DDescriptorNet
View on GitHub
Learning Descriptor Networks for 3D Shape Synthesis and Analysis
☆35Feb 15, 2022Updated 4 years ago
princeton-pli / LongProc
View on GitHub
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
☆36Feb 26, 2026Updated 4 months ago
INK-USC / expl-refinement
View on GitHub
Code for the paper "Refining Language Model with Compositional Explanation" (NeurIPS 2021)
☆11Oct 25, 2021Updated 4 years ago
Infini-AI-Lab / MagicPIG
View on GitHub
[ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation
☆255Dec 16, 2024Updated last year
radarFudan / mamba
View on GitHub
☆18Oct 26, 2024Updated last year
ryan-feng / GRAPHITE
View on GitHub
Code for GRAPHITE: Generating Automatic Physical Examples for Machine-Learning Attacks on Computer Vision Systems
☆12Apr 19, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
dame-cell / Triformer
View on GitHub
Transformers components but in Triton
☆34May 9, 2025Updated last year
gigio1023 / LLMCompiler-Pro
View on GitHub
An extended project of the LLM Compiler paper, focusing on developing LLM-based Autonomous Agents.
☆26Oct 22, 2024Updated last year
zilongzheng / PatchGenCN
View on GitHub
CVPR 2021 Oral Paper PatchGenCN
☆11Oct 28, 2021Updated 4 years ago
brc7 / DiversitySampling
View on GitHub
☆10Jun 16, 2022Updated 4 years ago
welleast / Learning2Hash
View on GitHub
A Survey on Learning to Hash
☆10Apr 10, 2018Updated 8 years ago
haizhongzheng / LTE
View on GitHub
☆13Oct 13, 2025Updated 9 months ago
zhaoyanpeng / vpcfg
View on GitHub
Visually Grounded PCFG Induction
☆38May 18, 2022Updated 4 years ago