☆29Mar 24, 2025Updated last year
Alternatives and similar repositories for FastTree-Artifact
Users that are interested in FastTree-Artifact are comparing it to the libraries listed below.
- A recommendation model kernel optimizing system☆12Jun 5, 2025Updated 10 months ago
- An Optimizing Compiler for Recommendation Model Inference☆26Jun 5, 2025Updated 10 months ago
- [ICLR 2025] DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference☆51Jun 17, 2025Updated 10 months ago
- This project contains my personal solutions, study notes, and reflections for the MIT 6.5940 course assignments.☆15Mar 1, 2024Updated 2 years ago
- ☆21Oct 21, 2024Updated last year
- A Parallel Secure Machine Learning Framework on GPUs☆21Nov 17, 2021Updated 4 years ago
- Dynamic Memory Management for Serving LLMs without PagedAttention☆482May 30, 2025Updated 11 months ago
- ☆32Jul 17, 2024Updated last year
- ☆20Dec 24, 2024Updated last year
- Some quick and dirty Postgres benchmarks☆15Feb 27, 2022Updated 4 years ago
- ☆13Nov 6, 2021Updated 4 years ago
- ☆85Apr 18, 2025Updated last year
- Sparse kernels for GNNs based on TVM☆17Nov 18, 2020Updated 5 years ago
- ☆21Jul 24, 2025Updated 9 months ago
- FlashSparse significantly reduces the computation redundancy for unstructured sparsity (for SpMM and SDDMM) on Tensor Cores through a Swa…☆39Oct 5, 2025Updated 6 months ago
- ☆28Oct 11, 2022Updated 3 years ago
- ☆34Mar 31, 2025Updated last year
- Short RL☆18Apr 16, 2026Updated 2 weeks ago
- A benchmark suite for Scalable Diverse Model Selection for Accessible Transfer Learning from our NeurIPS 2021 paper.☆15Dec 14, 2022Updated 3 years ago
- Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity☆243Sep 24, 2023Updated 2 years ago
- Github repository for CLAPACK (fork of CLAPACK 3.2.1 patched for our needs)☆10Aug 15, 2018Updated 7 years ago
- ☆105Sep 9, 2024Updated last year
- ☆19Feb 18, 2025Updated last year
- Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]☆12Nov 8, 2024Updated last year
- Official implementation of paper "HiAE: A High-Throughput Authenticated Encryption Algorithm for Cross-Platform Efficiency"☆19Nov 11, 2025Updated 5 months ago
- Please visit https://github.com/HKUSTDial/NL2SQL360 to get the official code!☆10Sep 1, 2024Updated last year
- ☆26Oct 9, 2025Updated 6 months ago
- Bilibili crawler☆15Dec 10, 2023Updated 2 years ago
- A book about Ph.D. student and research career planning☆29Oct 21, 2025Updated 6 months ago
- Next-Generation AI-Assisted Kernel Engineering for Multi-Chip Systems☆45Apr 15, 2026Updated 2 weeks ago
- Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores (EuroSys'25)☆15Jul 17, 2025Updated 9 months ago
- homework in SCUT_SE☆12Nov 9, 2021Updated 4 years ago
- [HPCA 2026] A GPU-optimized system for efficient long-context LLMs decoding with low-bit KV cache.☆86Dec 18, 2025Updated 4 months ago
- Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning, release the dataset and the model weight☆13May 26, 2025Updated 11 months ago
- A DAG processor and compiler for a tree-based spatial datapath.☆16Aug 24, 2022Updated 3 years ago
- GBDT-based model with efficient unlearning (SIGMOD 2023)☆10Sep 7, 2025Updated 7 months ago
- ☆47Sep 8, 2025Updated 7 months ago
- A throughput-oriented high-performance serving framework for LLMs☆954Mar 29, 2026Updated last month
- ☆14Apr 24, 2024Updated 2 years ago