Emerging-AI / ENOVA
A deployment, monitoring and autoscaling service towards serverless LLM serving.
☆162Updated last week
Related projects
Alternatives and complementary repositories for ENOVA
- MIXQ: Taming Dynamic Outliers in Mixed-Precision Quantization by Online Prediction☆81Updated 3 weeks ago
- LLM Benchmark for Code☆33Updated 3 months ago
- [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding☆230Updated 2 months ago
- Support mixed-precision inference with vLLM☆95Updated 2 weeks ago
- Mixed-precision inference with TensorRT-LLM☆93Updated 3 weeks ago
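- Train an LLM from scratch with DeepSpeed, through the pretraining and SFT stages, to verify the LLM's ability to learn knowledge, understand language, and answer questions☆155Updated 4 months ago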
- Accurate, private and configurable document retrieval LLM☆130Updated this week
- ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference☆124Updated 3 weeks ago
- PyTorch library for relational table learning with LLMs.☆283Updated this week
- A curated list of awesome leaderboard-oriented resources for foundation models☆192Updated this week
- OSAI: Your AI-powered OS assistant. Rename files, manage environment variables, set reminders, and control your system through natural la…☆110Updated 3 months ago
- SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation☆59Updated 3 weeks ago
- An acceleration library that supports arbitrary bit-width combinatorial quantization operations☆226Updated last month
- A Contextual RAG Bot Framework☆107Updated 3 weeks ago
- Engy is an AI-powered development tool that generates fully functional web applications from natural language, streamlining the process f…☆256Updated 2 weeks ago
- The Official Repo of ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code (https://a…☆355Updated this week
- A NanoGPT implementation for PyTorch 2.0, written in anticipation of the torch 2.0 release: faster and simpler, and a good tutorial for learning GPT.☆60Updated 9 months ago
- Teaches you to implement a deep learning framework step by step using only basic Python syntax and NumPy☆120Updated 3 months ago
- ☆105Updated 7 months ago
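- [grps integration with trtllm] A high-performance, pure C++ OpenAI LLM service built with GPRS+TensorRT-LLM+Tokenizers.cpp; supports chat and function call modes, AI agents, distributed multi-GPU inference, multimodal input, and a Gradio chat interface.☆92Updated 2 weeks ago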
- AI-powered tools playground☆155Updated last year
- This is the official code repository of MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tas…☆63Updated 2 months ago
- Simple python WebUI for fine-tuning ChatGPT (gpt-3.5-turbo)☆206Updated last year
- [NeurIPS 2024] EffiBench: Benchmarking the Efficiency of Automatically Generated Code☆57Updated last month
- MPLSandbox is an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler a…☆20Updated this week
- AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning (NeurIPS 2024)☆168Updated last week
- TxBKG - Knowledge Graph Generation for Any PDFs☆223Updated last month
- A multimodal, multi-agent development framework☆56Updated last week
- The repository for the paper titled "Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks"☆184Updated 3 weeks ago
- This tool (enhance_long) aims to enhance the long-context extrapolation capability of LLaMA 2 at the lowest possible cost, preferably without …☆47Updated 11 months ago