BitLinear implementation
☆35 · Jan 1, 2026 · Updated 2 months ago
Alternatives and similar repositories for bitlinear
Users who are interested in bitlinear compare it to the libraries listed below.
- Implementation of the BitLinear layer from: The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits ☆13 · Sep 11, 2024 · Updated last year
- [ICML-2025] We introduce Lie group Relative position Encodings (LieRE) that goes beyond RoPE in supporting n-dimensional inputs. ☆14 · Aug 8, 2025 · Updated 6 months ago
- RPC/XDR protocol compiler (from jungerl) ☆14 · Oct 4, 2019 · Updated 6 years ago
- recipe for training fully-featured self supervised image jepa models ☆12 · Jun 4, 2025 · Updated 9 months ago
- Kylie is a blond and small Erlang/Elixir client for Cayley graph database ☆12 · Feb 15, 2026 · Updated 2 weeks ago
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments ☆13 · Jul 8, 2024 · Updated last year
- ☆11 · Dec 23, 2018 · Updated 7 years ago
- [Oral; NeurIPS OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers ☆15 · Feb 12, 2026 · Updated 2 weeks ago
- ☆29 · Feb 27, 2024 · Updated 2 years ago
- Say hello to ErlangRump an Erlang Microkernel powered by Rumprun unikernel ☆14 · Sep 4, 2016 · Updated 9 years ago
- PyTorch implementation of StableMask (ICML'24) ☆15 · Jun 27, 2024 · Updated last year
- PyTorch implementation of Planar Flow ☆17 · Dec 2, 2019 · Updated 6 years ago
- [ICML 2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen ☆17 · Sep 7, 2024 · Updated last year
- Exthereum, The Elixir Ethereum Client ☆22 · Nov 6, 2018 · Updated 7 years ago
- cbor encoder/decoder in Erlang ☆24 · Mar 15, 2015 · Updated 10 years ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆112 · May 19, 2025 · Updated 9 months ago
- HGRN2: Gated Linear RNNs with State Expansion ☆56 · Aug 20, 2024 · Updated last year
- ☆24 · Sep 25, 2024 · Updated last year
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆60 · Feb 7, 2025 · Updated last year
- PyTorch implementation for HyperMixing, a linear-time token-mixing technique used in HyperMixer architecture ☆26 · Jun 12, 2023 · Updated 2 years ago
- Neural network quantization for research and prototyping ☆42 · Feb 19, 2026 · Updated last week
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 · Apr 17, 2024 · Updated last year
- An Erlang Syntactic Sugar Library ☆22 · May 29, 2022 · Updated 3 years ago
- Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity" ☆32 · May 28, 2025 · Updated 9 months ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Jun 12, 2024 · Updated last year
- Small service for snapshotting eleveldb without stopping the Erlang node ☆32 · Mar 21, 2023 · Updated 2 years ago
- Official implementation of the transformer (TF) architecture suggested in a paper entitled "Looped Transformers as Programmable Computers… ☆30 · Apr 8, 2023 · Updated 2 years ago
- ☆35 · Apr 12, 2024 · Updated last year
- [CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference ☆30 · Mar 14, 2024 · Updated last year
- Replication of CRDTs ☆38 · Apr 3, 2021 · Updated 4 years ago
- Fast and tiny NeRF implementation ☆41 · Dec 5, 2023 · Updated 2 years ago
- ☆35 · Dec 12, 2023 · Updated 2 years ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆39 · Jun 11, 2025 · Updated 8 months ago
- All things manipulating, quantifying, and visualizing geochemical data ☆13 · Jan 19, 2024 · Updated 2 years ago
- This module includes functions that can be used to simulate mechanochemical phenomena. ☆11 · Nov 16, 2021 · Updated 4 years ago
- This repository contains the Parasol processor, which enables next-generation privacy preserving applications. Users can run arbitrary co… ☆11 · Feb 25, 2026 · Updated last week
- Official code for the paper "Attention as a Hypernetwork" ☆51 · Feb 24, 2026 · Updated last week
- Erlang on Bare Metal ☆57 · Sep 29, 2015 · Updated 10 years ago
- [CVPR2025] Breaking the Low-Rank Dilemma of Linear Attention ☆38 · Mar 11, 2025 · Updated 11 months ago