facebookresearch / chai
CHAI is a library for dynamic pruning of attention heads for efficient LLM inference.
☆13Updated 4 months ago
Alternatives and similar repositories for chai:
Users who are interested in chai are comparing it to the libraries listed below:
- Official repo of dataset-decomposition paper [NeurIPS 2024]☆16Updated 3 months ago
- [Oral; NeurIPS OPT2024] μLO: Compute-Efficient Meta-Generalization of Learned Optimizers☆12Updated last month
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions.☆20Updated 10 months ago
- Here we will test various linear attention designs.☆60Updated last year
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated this week
- GoldFinch and other hybrid transformer components☆10Updated 3 weeks ago
- Unofficial implementation of Neural Analysis and Synthesis☆7Updated 3 years ago
- JAX Scalify: end-to-end scaled arithmetics☆16Updated 5 months ago
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant …☆16Updated last year
- Implementation of https://arxiv.org/pdf/2312.09299☆20Updated 9 months ago
- Training hybrid models for dummies.☆20Updated 3 months ago
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆28Updated last year
- A compressed alternative to matrix multiplication using state-of-the-art compression ROBE-Z☆9Updated last year
- Implementation of Google's USM speech model in PyTorch☆31Updated 2 weeks ago
- Official Code Implementation for 'A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models'☆18Updated 9 months ago
- ☆31Updated last year
- Code for the note "NF4 Isn't Information Theoretically Optimal (and that's Good)"☆18Updated last year
- URL downloader supporting checkpointing and continuous checksumming.☆19Updated last year
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence☆60Updated 3 years ago
- ☆33Updated 7 months ago
- Triton kernels for Flux☆20Updated 3 months ago
- Implementation of Spectral State Space Models☆16Updated last year
- ☆21Updated 4 months ago
- ☆14Updated last month
- Implementation of a Light Recurrent Unit in PyTorch☆47Updated 6 months ago
- Benchmarking different models with PyTorch 2.0☆21Updated 2 years ago
- Visualising Losses in Deep Neural Networks☆16Updated 9 months ago
- Official code implementation for the work Preference Alignment with Flow Matching (NeurIPS 2024)☆49Updated 5 months ago
- Efficiently computing & storing token n-grams from large corpora☆23Updated 6 months ago
- Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens)☆50Updated last month