smart-lty / nano-PEARL
Draft-Target Disaggregation LLM Serving System via Parallel Speculative Decoding.
☆126Updated last month
Alternatives and similar repositories for nano-PEARL
Users that are interested in nano-PEARL are comparing it to the libraries listed below
- A high-performance inference engine for LLMs, optimized for diverse AI accelerators.☆801Updated this week
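- AI Infra mainly refers to AI infrastructure, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.☆255Updated last year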
- [EMNLP 2025] RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions☆136Updated 8 months ago
- Deep Research☆303Updated 3 months ago
- [NeurIPS 2025 spotlight] QFFT, Question-Free Fine-Tuning for Adaptive Reasoning☆90Updated last month
- ☆71Updated 5 months ago
- Omni Model Benchmark with high quality and diversity, which reveals the Compositional Law. We’re now focused on Chinese scenarios — and a…☆75Updated this week
- [ICCV 2023] RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers☆138Updated last year
- Official repo of Toucan: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments☆185Updated 2 months ago
- MCPMark is a comprehensive, stress-testing MCP benchmark designed to evaluate model and agent capabilities in real-world MCP use.☆342Updated last week
- ☆23Updated last month
- ☆118Updated last month
- BuildArena, where LLM agents design, build, and test rockets, cars, and bridges in a physics simulator given a goal-directed sentence.☆80Updated last month
- An open-source agentic framework that enables AI to use computers like humans and can provide a multi-agent runtime environment as an inf…☆114Updated this week
- Trainable fast and memory-efficient sparse attention☆480Updated this week
- OK Computer in a Box: Your Self-Hosted Agent Workflow Layer☆123Updated 2 weeks ago
- Python port of Moses tokenizer, truecaser and normalizer☆112Updated 2 years ago
- Local nonlinear causal attention latent diffusion models for visual story synthesizing☆32Updated 8 months ago
- ☆30Updated 2 months ago
- [ICCV 2023] I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference☆192Updated last year
- ⚡Fast-start scaffold for Gin Framework APIs. Includes MySQL, Redis-powered JWT auth, and a well-structured architecture to launch your Go…☆44Updated last month
- Learn how to develop kernels☆70Updated this week
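- A curated list of major Chinese tech companies' GitHub addresses and popular open-source projects, to help readers understand the Chinese open-source ecosystem more efficiently☆112Updated 5 months ago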
- ☆46Updated this week
- The directed brute force cracking tool, after collecting information, uses it to generate a special dictionary containing the feature inf…☆39Updated 2 years ago
- build PyTorch with CUDA for Jetson Orin and Thor.☆33Updated last week
- A Western-cuisine community and online store built on Vue + Node, with user and administrator roles (Vue + Koa + MySQL)☆74Updated 10 months ago
- ☆30Updated 2 months ago
- An intelligent obstacle-avoidance system for campus parcel delivery, designed on the Phytium Pi☆113Updated 5 months ago
- ☆30Updated 2 months ago