Chivier / easy-gpt4o
Easy-GPT4O open-source version
☆75 Updated 10 months ago
Alternatives and similar repositories for easy-gpt4o:
Users who are interested in easy-gpt4o are comparing it to the libraries listed below:
- LLM-powered Python ☆14 Updated 2 months ago
- ☆45 Updated 9 months ago
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆150 Updated 6 months ago
- Pretrain, finetune and serve LLMs on Intel platforms with Ray ☆123 Updated last week
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. ☆93 Updated last year
- ☆116 Updated 11 months ago
- DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including … ☆240 Updated 3 weeks ago
- ❓Curie: Automated and Rigorous Scientific Experimentation with AI Agents ☆54 Updated this week
- 🌟 Revolutionize Your Operations with One Sentence Automation: Utilizing large language models and Multi-Agents to generate operational c… ☆53 Updated last year
- A large-scale simulation framework for LLM inference ☆355 Updated 4 months ago
- Modular and structured prompt caching for low-latency LLM inference ☆89 Updated 4 months ago
- Code for MLSys 2024 Paper "SiDA-MoE: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models" ☆16 Updated 11 months ago
- 🔥Your Daily Dose of AI Research from Hugging Face 🔥 Stay updated with the latest AI breakthroughs! This bot automatically collects and… ☆49 Updated this week
- Stateful LLM Serving ☆50 Updated 3 weeks ago
- A MoE impl for PyTorch, [ATC'23] SmartMoE ☆61 Updated last year
- PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing ☆14 Updated 2 weeks ago
- Official implementation of ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking". ☆47 Updated 8 months ago
- Large-model inference framework acceleration: make LLMs fly ☆19 Updated 10 months ago
- ☆49 Updated last year
- ☆94 Updated 5 months ago
- Using Llama-3.1 70b on Groq to create o1-like reasoning chains ☆19 Updated 6 months ago
- Self-host LLMs with LMDeploy and BentoML ☆18 Updated 2 weeks ago
- Multi-Faceted AI Agent and Workflow Autotuning. Automatically optimizes LangChain, LangGraph, DSPy programs for better quality, lower exe… ☆199 Updated last week
- ☆19 Updated 3 months ago
- ☆54 Updated 2 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆130 Updated 9 months ago
- GLM Series Edge Models ☆131 Updated last month
- LLM Serving Performance Evaluation Harness ☆73 Updated last month
- High performance Transformer implementation in C++. ☆113 Updated 2 months ago
- PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation ☆28 Updated 4 months ago