Accelerating LLM inference with techniques like speculative decoding, quantization, and kernel fusion, focusing on implementing state-of-the-art research papers.
☆11 · Jul 1, 2025 · Updated 7 months ago
Alternatives and similar repositories for fast-llm-inference
Users interested in fast-llm-inference are comparing it to the libraries listed below.
- A simple WeChat Official Account layout tool based on Dify ☆17 · Jun 27, 2025 · Updated 7 months ago
- A Complete Introduction to Building Generative AI Apps with Dify ☆17 · May 25, 2025 · Updated 9 months ago
- OpenMindedChatbot is a Proof Of Concept that leverages the power of Open source Large Language Models (LLM) with Function Calling capabil… ☆30 · Dec 19, 2023 · Updated 2 years ago
- Write the database metadata into the dify knowledge ☆12 · Dec 30, 2025 · Updated last month
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte… ☆27 · Feb 13, 2026 · Updated last week
- Workflow automation, but you just describe what you want and it happens. ☆26 · Nov 22, 2025 · Updated 3 months ago
- General Use Timeseries Containers for Rust ☆11 · Dec 31, 2020 · Updated 5 years ago
- 100 Production-Ready Claude Code Skills - The most comprehensive collection of AI skills for sales, business automation, content creation… ☆36 · Oct 22, 2025 · Updated 4 months ago
- 🤖AI Agents for Financial Trading💰: LLM-Driven Stock Prediction & Investment Recommendation System ☆13 · Apr 14, 2025 · Updated 10 months ago
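- LangReact is a configuration-driven Planning Agent application development tool that uses configuration and plugins to quickly add planning capabilities to your GPT application. ☆12 · Apr 23, 2024 · Updated last year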
- Zhiyu AI (知予人工智能): From Learner to Researcher ☆13 · Jan 20, 2025 · Updated last year
- Python Telegraph api. ☆15 · Mar 22, 2025 · Updated 11 months ago
- MyAgents - A general-purpose desktop agent that lets non-developers feel the acceleration that AI brings and become a better version of themselves. ☆45 · Updated this week
- Knowledge sharing of AWS (Amazon Web Services) Cloud ☆12 · Jun 7, 2021 · Updated 4 years ago
- A web interface for SleekDB written in PHP ☆11 · Jan 22, 2022 · Updated 4 years ago
- Use the knowledge graph generated by GraphRAG as the external knowledge base for the Dify workflow. ☆21 · Jun 4, 2025 · Updated 8 months ago
- A small framework to benchmark forecasting models via backtesting ☆13 · Nov 25, 2023 · Updated 2 years ago
- An experimental distributed map reduce system based on Google's MapReduce, written in Rust! ☆10 · Aug 3, 2022 · Updated 3 years ago
- A distilled DeepSeek-R1 variant built on Qwen2.5-32B, fine-tuned with curated data for enhanced performance and efficiency. <metadata> gp… ☆16 · Mar 11, 2025 · Updated 11 months ago
- An SSH plugin for Dify ☆12 · Jan 16, 2026 · Updated last month
- 🎵 When AI tools vibe together on your PRs. Let CodeRabbit and Claude Code handle the repetitive feedback while you ship features. Built … ☆12 · Nov 24, 2025 · Updated 3 months ago
- This is a fork from Ryan Carson's AI Dev Tasks repository, with some code cleanup and refactoring to enable support for PostgreSQL databa… ☆15 · Sep 8, 2025 · Updated 5 months ago
- A Dify knowledge base retrieval tool ☆13 · Apr 3, 2025 · Updated 10 months ago
- Analytics tool that applies Natural Language Processing (NLP) and Machine Learning (ML), such as concept extraction, idea classification,… ☆10 · Dec 7, 2022 · Updated 3 years ago
- Java implementation for the Agent2Agent Protocol (A2A - https://github.com/google/A2A), enabling interaction between AI agents through a … ☆11 · Apr 21, 2025 · Updated 10 months ago
- An Offline and Secure Retrieval-Augmented Generation (RAG) system designed for efficient processing of diverse content types with minimal… ☆20 · Dec 29, 2024 · Updated last year
- A multi-agent framework to help with your homework. ☆10 · Mar 1, 2025 · Updated 11 months ago
- A simple AI developer agent ☆22 · Jul 21, 2025 · Updated 7 months ago
- Open-source repository for the OOPSLA'24 paper "CYCLE: Learning to Self-Refine Code Generation" ☆10 · Mar 8, 2024 · Updated last year
- A pure C++ cross-platform LLM acceleration library, callable from Python, supporting the chatglm-6B, llama, baichuan, and moss base models on x86 / ARM ☆12 · Jan 30, 2026 · Updated 3 weeks ago