Aleph-Alpha / trigrams
☆29 Updated 3 weeks ago
Related projects:
- Collection of autoregressive model implementations ☆62 Updated 2 weeks ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆39 Updated 2 weeks ago
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆38 Updated 3 weeks ago
- A repository for research on medium-sized language models. ☆71 Updated 3 months ago
- A byte-level decoder architecture that matches the performance of tokenized Transformers. ☆57 Updated 4 months ago
- PyTorch implementation of models from the Zamba2 series. ☆63 Updated last month
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆94 Updated 2 weeks ago
- Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of… ☆73 Updated last month
- Experiments for efforts to train a new and improved T5 ☆76 Updated 5 months ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆73 Updated 6 months ago
- A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs. ☆68 Updated 2 months ago
- BPE modification that removes intermediate tokens during tokenizer training. ☆13 Updated last week
- [WIP] Transformer to embed Danbooru labelsets ☆13 Updated 5 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆46 Updated 5 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆34 Updated 10 months ago
- NLP with Rust for Python 🦀🐍 ☆57 Updated 3 months ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning ☆54 Updated last month