pHaeusler / tinycatstories
☆10 · Updated 2 years ago
Alternatives and similar repositories for tinycatstories
Users interested in tinycatstories are comparing it to the libraries listed below.
- Tune MPTs ☆84 · Updated 2 years ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆204 · Updated last year
- ☆95 · Updated 2 years ago
- batched loras ☆349 · Updated 2 years ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user… ☆183 · Updated 3 months ago
- Evaluating LLMs with Dynamic Data ☆111 · Updated 3 weeks ago
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆42 · Updated 2 years ago
- RWKV infctx trainer, for training arbitrary context sizes, to 10k and beyond! ☆148 · Updated last year
- Comprehensive analysis of the differences in performance of QLoRA, LoRA, and full finetunes. ☆83 · Updated 2 years ago
- Experiments with generating open-source language model assistants ☆97 · Updated 2 years ago
- A bagel, with everything. ☆326 · Updated last year
- A lightweight, hackable, and efficient framework for training and fine-tuning language models ☆187 · Updated last week
- TART: A plug-and-play Transformer module for task-agnostic reasoning ☆202 · Updated 2 years ago
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe… ☆159 · Updated 2 years ago
- QLoRA: Efficient Finetuning of Quantized LLMs ☆79 · Updated last year
- Long context evaluation for large language models ☆226 · Updated 11 months ago
- LLaMa Tuning with Stanford Alpaca Dataset using Deepspeed and Transformers ☆50 · Updated 2 years ago
- Synthetic Role-Play Conversation Dataset Generation ☆49 · Updated 2 years ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT ☆224 · Updated last year
- ☆416 · Updated 2 years ago
- An unsupervised model merging algorithm for Transformers-based language models. ☆108 · Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers ☆426 · Updated 2 years ago
- ☆81 · Updated last year
- A pipeline for LLM knowledge distillation ☆112 · Updated 10 months ago
- Fast modular code to create and train cutting edge LLMs ☆68 · Updated last year
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 · Updated 2 years ago
- Command-line script for inferencing from models such as MPT-7B-Chat ☆100 · Updated 2 years ago
- PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention… ☆294 · Updated last year
- ☆535 · Updated 2 years ago
- Spherical Merge Pytorch/HF format Language Models with minimal feature loss. ☆144 · Updated 2 years ago