Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024).
☆25 · Feb 22, 2026 · Updated last month
Alternatives and similar repositories for IceFormer
Users who are interested in IceFormer are comparing it to the repositories listed below.
- Load and run Llama from safetensors files in C ☆15 · Oct 24, 2024 · Updated last year
- Accelerate multihead attention transformer model using HLS for FPGA ☆11 · Dec 7, 2023 · Updated 2 years ago
- Longitudinal Evaluation of LLMs via Data Compression ☆33 · May 29, 2024 · Updated last year
- Benchmark tests supporting the TiledCUDA library. ☆18 · Nov 19, 2024 · Updated last year
- ☆14 · Mar 22, 2024 · Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆17 · Jun 3, 2024 · Updated last year
- A DEMO for “Local Transformer With Spatial Partition Restore for Hyperspectral Image Classification (Xue et al., JSTARS, 2022)” ☆16 · Apr 17, 2024 · Updated last year
- ☆11 · Nov 22, 2025 · Updated 4 months ago
- ☆17 · Jul 29, 2024 · Updated last year
- Contrastive-Mutual-Learning-with-Pseudo-Label-Smoothing-for-Hyperspectral-Image-Classification ☆14 · Jan 16, 2025 · Updated last year
- Belief Revision based Caption Re-ranker with Visual Semantic Information. COLING 2022 ☆11 · Apr 13, 2025 · Updated 11 months ago
- Optimizing the Deployment of Tiny Transformers on Low-Power MCUs ☆33 · Sep 2, 2024 · Updated last year
- Mediapipe 0.10.1 with CUDA GPU Support python libs ☆10 · Dec 1, 2023 · Updated 2 years ago
- SGEMM optimization with cuda step by step ☆22 · Mar 23, 2024 · Updated 2 years ago
- ☆13 · Jan 7, 2025 · Updated last year
- Code (Pytorch) of "Multiple vision architectures-based hybrid network for hyperspectral image classification" ESWA-07/2023 Accepted. ☆14 · Jul 20, 2023 · Updated 2 years ago
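- A preprocessing toolkit for hyperspectral image classification in computer vision, including removal of irrelevant image backgrounds, data augmentation, and label file generation ☆18 · Nov 4, 2023 · Updated 2 years ago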
- ☆72 · Mar 26, 2025 · Updated last year
- Running inference on the ZeroSCROLLS benchmark ☆20 · Apr 18, 2024 · Updated last year
- Whisper in TensorRT-LLM ☆17 · Sep 21, 2023 · Updated 2 years ago
- Multiple GEMM operators are constructed with cutlass to support LLM inference. ☆19 · Aug 3, 2025 · Updated 7 months ago
- CACFTNet: A Hybrid Cov-Attention and Cross-Layer Fusion Transformer Network for Hyperspectral Image Classification[J]," in IEEE Transacti… ☆16 · Aug 13, 2024 · Updated last year
- QAQ: Quality Adaptive Quantization for LLM KV Cache ☆53 · Mar 27, 2024 · Updated 2 years ago
- ☆32 · May 26, 2024 · Updated last year
- ☆16 · Mar 13, 2023 · Updated 3 years ago
- ☆311 · Jul 10, 2025 · Updated 8 months ago
- Neptasm Mod ☆10 · Jul 22, 2023 · Updated 2 years ago
- An efficient spatial accelerator enabling hybrid sparse attention mechanisms for long sequences ☆32 · Mar 7, 2024 · Updated 2 years ago
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆123 · Jul 4, 2025 · Updated 8 months ago
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp… ☆21 · Mar 7, 2024 · Updated 2 years ago
- ☆18 · Jan 27, 2025 · Updated last year
- The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmar… ☆53 · Nov 5, 2024 · Updated last year
- A Python package mapping 2D coordinates to colors based on different 2D color maps. ☆16 · Dec 4, 2025 · Updated 3 months ago
- Segment Anything (SAM) at Home web app using Gradio ☆14 · Aug 7, 2023 · Updated 2 years ago
- Standalone Flash Attention v2 kernel without libtorch dependency ☆112 · Sep 10, 2024 · Updated last year
- ☆38 · Oct 21, 2025 · Updated 5 months ago
- Python Binding for waifu2x-ncnn-vulkan with PyBind11 ☆17 · Dec 30, 2023 · Updated 2 years ago
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling ☆12 · Mar 7, 2024 · Updated 2 years ago
- Dumpy: A Compact and Adaptive Index for Large Data Series Collections (SIGMOD'23) ☆13 · Dec 12, 2023 · Updated 2 years ago