microsoft / BitNet
Official inference framework for 1-bit LLMs
★ 12,615 · Updated 3 weeks ago
Alternatives and similar repositories for BitNet:
Users who are interested in BitNet are comparing it to the libraries listed below:
- Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs. ★ 4,399 · Updated this week
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ★ 11,197 · Updated this week
- Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚ ★ 18,680 · Updated this week
- Run PyTorch LLMs locally on servers, desktop and mobile ★ 3,462 · Updated this week
- Go ahead and axolotl questions ★ 8,293 · Updated this week
- Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory ★ 20,611 · Updated this week
- An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations. ★ 20,208 · Updated this week
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag… ★ 16,235 · Updated this week
- PyTorch native post-training library ★ 4,703 · Updated this week
- SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensiv… ★ 14,188 · Updated this week
- Implementation for MatMul-free LM. ★ 2,941 · Updated 2 months ago
- g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ★ 4,147 · Updated last month
- 🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents. ★ 5,197 · Updated this week
- Perplexica is an AI-powered search engine. It is an Open source alternative to Perplexity AI ★ 18,641 · Updated last week
- llama3 implementation one matrix multiplication at a time ★ 14,030 · Updated 7 months ago
- Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization ★ 3,791 · Updated 3 weeks ago
- Make websites accessible for AI agents ★ 14,568 · Updated this week
- A collection of notebooks/recipes showcasing some fun and effective ways of using Claude. ★ 9,387 · Updated this week
- Build multi-modal Agents with memory, knowledge, tools and reasoning. Chat with them using a beautiful Agent UI. ★ 17,869 · Updated this week
- Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We als… ★ 15,910 · Updated this week
- CoreNet: A library for training deep neural networks ★ 6,990 · Updated 3 months ago
- ★ 2,802 · Updated 4 months ago
- An open-source RAG-based tool for chatting with your documents. ★ 20,315 · Updated this week
- Blazingly fast LLM inference. ★ 4,826 · Updated this week
- LLocalSearch is a completely locally running search aggregator using LLM Agents. The user can ask a question and the system will use a ch… ★ 5,783 · Updated 4 months ago
- A framework for serving and evaluating LLM routers - save LLM costs without compromising quality ★ 3,442 · Updated 5 months ago
- SGLang is a fast serving framework for large language models and vision language models. ★ 7,353 · Updated this week
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ★ 9,326 · Updated 6 months ago
- A nanoGPT pipeline packed in a spreadsheet ★ 2,059 · Updated 7 months ago
- Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team. ★ 17,712 · Updated 3 months ago