Jaykef / ai-algorithms
First-principles implementations of various AI algorithms using a wide range of deep learning frameworks, accompanied by relevant research papers.
☆24 · Updated this week
Related projects:
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆34 · Updated 3 weeks ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning ☆54 · Updated last month
- A public implementation of the ReLoRA pretraining method, built on Lightning AI's PyTorch Lightning suite. ☆33 · Updated 6 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆39 · Updated 3 weeks ago
- Implementation of the Mamba SSM with hf_integration. ☆55 · Updated 3 weeks ago
- Data preparation code for the CrystalCoder 7B LLM. ☆42 · Updated 4 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention". ☆87 · Updated 8 months ago
- PyTorch implementations of models from the Zamba2 series. ☆63 · Updated last month
- Collection of autoregressive model implementations. ☆62 · Updated 2 weeks ago
- A list of language models with permissive licenses such as MIT or Apache 2.0. ☆21 · Updated 3 weeks ago
- Mixture-of-Experts (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations. ☆12 · Updated 7 months ago
- A repository for research on medium-sized language models. ☆71 · Updated 3 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model, with or without FIM. ☆46 · Updated 5 months ago
- A byte-level decoder architecture that matches the performance of tokenized Transformers. ☆57 · Updated 4 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆34 · Updated 5 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆20 · Updated 7 months ago
- ReBase: Training Task Experts through Retrieval-Based Distillation ☆27 · Updated 2 months ago
- Cascade Speculative Drafting ☆23 · Updated 5 months ago