☆32 Jan 1, 2024 Updated 2 years ago
Alternatives and similar repositories for fim-llama-deepspeed
Users who are interested in fim-llama-deepspeed are comparing it to the libraries listed below.
- [Corca / OR] Solver for Multi-dimensional Multi-demand Quadratic Knapsack Problems ☆12 Mar 22, 2022 Updated 3 years ago
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam. ☆86 Jul 28, 2024 Updated last year
- An official codebase for "NormLens: Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Comm… ☆10 May 9, 2024 Updated last year
- Rust bindings for CTranslate2 ☆14 Jun 21, 2023 Updated 2 years ago
- ☆13 Jun 3, 2024 Updated last year
- [EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion, from our EMNLP 2023 paper - Accelerating Toeplitz… ☆14 Oct 17, 2023 Updated 2 years ago
- An unofficial implementation of the SOLAR-10.7B model, plus the newly proposed interlocked-DUS (iDUS) implementation and experiment details. ☆14 Mar 20, 2024 Updated last year
- SMILE: A Multimodal Dataset for Understanding Laughter ☆13 Jun 15, 2023 Updated 2 years ago
- A framework for few-shot evaluation of autoregressive language models. ☆16 Aug 23, 2023 Updated 2 years ago
- sigma-MoE layer ☆21 Jan 5, 2024 Updated 2 years ago
- Yet another minimalist deep-learning framework optimized for inference ☆36 Updated this week
- GitHub repository for Zero Shot Visual Storytelling ☆15 Dec 6, 2021 Updated 4 years ago
- trying to make WebGPU a bit easier to use ☆19 Jan 9, 2024 Updated 2 years ago
- Create paraphrased Korean sentences with GPT-3 ☆34 Jan 30, 2023 Updated 3 years ago
- Using multiple LLMs for ensemble forecasting ☆16 Jan 17, 2024 Updated 2 years ago
- Layout Analysis Dataset with Segmonto (LADaS) ☆24 Jul 12, 2025 Updated 7 months ago
- ☆23 Jan 27, 2025 Updated last year
- Awesome Chinese Corpus Datasets and Models. ☆18 Oct 28, 2019 Updated 6 years ago
- Restaurant Recommender System ☆15 May 5, 2022 Updated 3 years ago
- Sakura-SOLAR-DPO: Merge, SFT, and DPO ☆116 Dec 30, 2023 Updated 2 years ago
- Repo for "Smart Word Suggestions" (SWS) task and benchmark ☆20 Dec 4, 2023 Updated 2 years ago
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens ☆25 Nov 6, 2023 Updated 2 years ago
- ☆27 May 3, 2024 Updated last year
- ☆24 Jun 4, 2024 Updated last year
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆61 Apr 8, 2024 Updated last year
- AIvilization is a civilization that artificial intelligence creates on its own. Within this civilization, AIs find innovative ways to hel… ☆64 Apr 19, 2024 Updated last year
- Code for NeurIPS LLM Efficiency Challenge ☆60 Apr 9, 2024 Updated last year
- ☆34 Sep 10, 2024 Updated last year
- A repository for research on medium sized language models. ☆78 May 23, 2024 Updated last year
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor ☆31 Apr 8, 2024 Updated last year
- Cascade Speculative Drafting ☆33 Apr 2, 2024 Updated last year
- Data sets and ML models versioning example from DVC get started ☆10 Jun 4, 2024 Updated last year
- Oak National Academy's AI Auto Eval tools provide LLM-as-a-judge evaluation of lesson plans and resources ☆17 Nov 4, 2025 Updated 4 months ago
- Understanding the correlation between different LLM benchmarks ☆29 Jan 11, 2024 Updated 2 years ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆204 Aug 10, 2024 Updated last year
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆81 Jan 18, 2024 Updated 2 years ago
- Run Ollama LLM models in Google Colab for free ☆38 Nov 24, 2024 Updated last year
- Token Omission Via Attention ☆127 Oct 13, 2024 Updated last year
- Bilibili API collection and documentation [continuously updated...] ☆10 Apr 25, 2025 Updated 10 months ago