1.58-bit LLaMa model
☆83 · Apr 3, 2024 · Updated last year
Alternatives and similar repositories for bllama
Users that are interested in bllama are comparing it to the libraries listed below.
- Experimental BitNet Implementation ☆74 · Nov 27, 2025 · Updated 4 months ago
- a lightweight, open-source blueprint for building powerful and scalable LLM chat applications ☆28 · Jun 7, 2024 · Updated last year
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆155 · Oct 15, 2024 · Updated last year
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit ☆181 · Apr 19, 2024 · Updated last year
- Download full or partial git-lfs repos without temporarily using 2x disk space ☆31 · Oct 13, 2023 · Updated 2 years ago
- The repository of paper Personalized Multimodal Response Generation with Large Language Models ☆18 · Jun 28, 2024 · Updated last year
- MCP tools for Rust Context Engineering (rustdocs, rust analyzer) ☆16 · Feb 8, 2026 · Updated last month
- A pure and fast NumPy implementation of Mamba with cache support. ☆18 · Jun 16, 2024 · Updated last year
- ☆41 · Feb 14, 2026 · Updated last month
- ☆12 · Feb 23, 2023 · Updated 3 years ago
- Modeling code for a BitNet b1.58 Llama-style model. ☆25 · Apr 30, 2024 · Updated last year
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆17 · Aug 30, 2024 · Updated last year
- John Shutt's "Kernel" language implemented on ABE (C) runtime. ☆13 · Sep 3, 2018 · Updated 7 years ago
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- The DPAB-α Benchmark ☆32 · Jan 15, 2025 · Updated last year
- Your personal ArXiv Feed ☆23 · Dec 18, 2024 · Updated last year
- Inference Llama 2 in one file of pure C ☆14 · Jul 24, 2023 · Updated 2 years ago
- alternative way of calculating self-attention ☆18 · May 25, 2024 · Updated last year
- ☆51 · Feb 19, 2025 · Updated last year
- Advanced Ultra-Low Bitrate Compression Techniques for the LLaMA Family of LLMs ☆110 · Jan 11, 2024 · Updated 2 years ago
- Train your own small bitnet model ☆78 · Oct 20, 2024 · Updated last year
- [ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models ☆333 · Nov 26, 2025 · Updated 4 months ago
- SLOP Detector and analyzer based on dictionary for shareGPT JSON and text ☆84 · Mar 12, 2026 · Updated 2 weeks ago
- ☆581 · Oct 29, 2024 · Updated last year
- ☆51 · May 31, 2024 · Updated last year
- This is work done by the Oxen.ai Community, trying to reproduce the Self-Rewarding Language Model paper from MetaAI. ☆133 · Nov 16, 2024 · Updated last year
- A universal adapter including zero-copy Python bindings for Philip Turner's metal flash attention library. ☆24 · Dec 15, 2025 · Updated 3 months ago
- various experiments for scaling inference time compute with small reasoning models ☆17 · Jan 16, 2025 · Updated last year
- 🌳 MCTS-inspired parallel beam search for conversation optimization. Explore multiple dialogue strategies simultaneously, stress-test a… ☆35 · Jan 18, 2026 · Updated 2 months ago
- AirLLM 70B inference with single 4GB GPU ☆20 · Jun 27, 2025 · Updated 9 months ago
- Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch ☆1,904 · Mar 20, 2026 · Updated last week
- AI Based "Happiness Optimizer" ☆12 · Oct 20, 2024 · Updated last year
- A ComfyUI node for transforming images into descriptive text using templated visual question answering. Leverages Hugging Face's VQA mode… ☆12 · Apr 1, 2025 · Updated 11 months ago
- ☆19 · Apr 29, 2024 · Updated last year
- A simple library for working with Hugging Face models. ☆14 · Dec 30, 2024 · Updated last year
- This is the OFFICIAL CybernetiX S3C website. ☆22 · Feb 4, 2026 · Updated last month
- ☆17 · Feb 29, 2024 · Updated 2 years ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,476 · Mar 4, 2026 · Updated 3 weeks ago
- AI Assistant ☆20 · Feb 21, 2026 · Updated last month