☆32Jan 1, 2024Updated 2 years ago
Alternatives and similar repositories for fim-llama-deepspeed
Users who are interested in fim-llama-deepspeed are comparing it to the libraries listed below.
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD only; don't use it for Adam☆87Jul 28, 2024Updated last year
- ☆16Jun 5, 2022Updated 4 years ago
- Restaurant Recommender System☆15May 5, 2022Updated 4 years ago
- This action allows caching dependencies and build outputs to improve workflow execution time on self hosted machine.☆66Jun 6, 2025Updated last year
- An official codebase for "NormLens: Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Comm…☆10May 9, 2024Updated 2 years ago
- ☆27May 3, 2024Updated 2 years ago
- 🐜🔧 A minimalistic tool to fine-tune your LLMs☆18Aug 17, 2023Updated 2 years ago
- Using multiple LLMs for ensemble Forecasting☆16Jan 17, 2024Updated 2 years ago
- Loader extension for tabbyAPI in SillyTavern☆26Jun 30, 2025Updated 11 months ago
- A framework for few-shot evaluation of autoregressive language models.☆16Aug 23, 2023Updated 2 years ago
- ☆13Jun 3, 2024Updated 2 years ago
- An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.☆14Mar 20, 2024Updated 2 years ago
- ☆18Jun 15, 2023Updated 2 years ago
- Create paraphrased Korean sentences with GPT-3☆35Jan 30, 2023Updated 3 years ago
- Simple Model Similarities Analysis☆21Feb 3, 2024Updated 2 years ago
- Layout Analysis Dataset with Segmonto (LADaS)☆25May 29, 2026Updated 2 weeks ago
- AIvilization is a civilization that artificial intelligence creates on its own. Within this civilization, AIs find innovative ways to hel…☆78Apr 19, 2024Updated 2 years ago
- ☆10Apr 8, 2021Updated 5 years ago
- Rust bindings for CTranslate2☆14Jun 21, 2023Updated 2 years ago
- Sakura-SOLAR-DPO: Merge, SFT, and DPO☆116Dec 30, 2023Updated 2 years ago
- Modular task agnostic training pipeline using LFM2 from Liquid AI with unsloth.☆16Sep 13, 2025Updated 9 months ago
- ☆40Apr 27, 2024Updated 2 years ago
- ☆11Dec 22, 2020Updated 5 years ago
- Understanding the correlation between different LLM benchmarks☆29Jan 11, 2024Updated 2 years ago
- [EMNLP 2023] Official implementation of the algorithm ETSC: Exact Toeplitz-to-SSM Conversion our EMNLP 2023 paper - Accelerating Toeplitz…☆14Oct 17, 2023Updated 2 years ago
- sigma-MoE layer☆21Jan 5, 2024Updated 2 years ago
- ☆13Oct 31, 2022Updated 3 years ago
- DSTC11 Track 5 - Task-oriented Conversational Modeling with Subjective Knowledge☆45Jun 12, 2023Updated 3 years ago
- Set of scripts to finetune LLMs☆38Mar 30, 2024Updated 2 years ago
- Corpus to accompany: "Selective Vision is the Challenge for Visual Reasoning: A Benchmark for Visual Argument Understanding"☆11Apr 11, 2025Updated last year
- JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning☆10Nov 3, 2024Updated last year
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆62Apr 8, 2024Updated 2 years ago
- Semi-supervised spoken language understanding (SLU) via self-supervised speech and language model pretraining☆12Mar 23, 2021Updated 5 years ago
- Unix pipeline utilities based on LLMs☆16May 15, 2023Updated 3 years ago
- 🚀 Send custom messages to Slack to notify about application status, progress, errors, health checks etc.☆13Feb 18, 2025Updated last year
- Repo hosting codes and materials related to speeding LLMs' inference using token merging.☆37Oct 9, 2025Updated 8 months ago
- ☆74Sep 5, 2023Updated 2 years ago
- ☆21Nov 24, 2022Updated 3 years ago
- Implementation of paper "End-to-end lyrics alignment for polyphonic music using an audio-to-character recognition model"☆18Nov 20, 2022Updated 3 years ago