mlc-ai / web-llm-assistant
AI Assistant running within your browser.
☆44Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for web-llm-assistant
- ☆114Updated 7 months ago
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho…☆100Updated 3 weeks ago
- A generalist agent that can go online and accomplish complex tasks using semantic-kernel and autogen.☆25Updated 11 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆29Updated 6 months ago
- Self-host LLMs with vLLM and BentoML☆74Updated last week
- 🌟EasyAGI : A generalist agent that can go online and accomplish complex tasks.☆23Updated 11 months ago
- Implementation of nougat that focuses on processing pdf locally.☆73Updated 6 months ago
- A collection of all available inference solutions for the LLMs☆73Updated 2 months ago
- ☆26Updated last year
- ☆69Updated this week
- LangChain + LiteLLM that works☆25Updated 3 weeks ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆89Updated this week
- ☆120Updated this week
- Never forget anything again! Combine AI and intelligent tooling for a local knowledge base to track, catalogue, annotate, and plan for you…☆32Updated 6 months ago
- Fast Inference of MoE Models with CPU-GPU Orchestration☆172Updated this week
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆22Updated this week
- A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs.☆74Updated last month
- ☆50Updated 2 months ago
- A function to do all☆35Updated 7 months ago
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep…☆32Updated 3 weeks ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks☆31Updated 6 months ago
- Augment Swarm with durable execution to help you build reliable and scalable multi-agent systems.☆72Updated 2 weeks ago
- ☆21Updated 3 months ago
- Deploy your autonomous agents to production grade environments with 99% Uptime Guarantee, Infinite Scalability, and self-healing.☆27Updated this week
- 5X faster 60% less memory QLoRA finetuning☆21Updated 5 months ago
- Tutorial to get started with SkyPilot!☆56Updated 6 months ago
- The official Python library for Formulaic☆14Updated 6 months ago
- Zephyr 7B beta RAG Demo inside a Gradio app powered by BGE Embeddings, ChromaDB, and Zephyr 7B Beta LLM.☆35Updated last year
- One click templates for inferencing Language Models☆120Updated this week
- The driver for LMCache core to run in vLLM☆11Updated this week