JD-P / minihf
MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user develop their prompts into full models.
☆151 Updated this week
Related projects
Alternatives and complementary repositories for minihf
- Comprehensive analysis of the difference in performance of QLoRA, LoRA, and full fine-tunes. ☆81 Updated last year
- Full finetuning of large language models without large memory requirements ☆93 Updated 10 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆113 Updated 3 weeks ago
- ☆49 Updated 8 months ago
- ☆91 Updated last year
- Aidan Bench attempts to measure <big_model_smell> in LLMs. ☆96 Updated this week
- ☆93 Updated last month
- look how they massacred my boy ☆58 Updated last month
- Low-Rank adapter extraction for fine-tuned transformers models ☆162 Updated 6 months ago
- ☆57 Updated 11 months ago
- The history files when recording human interaction while solving ARC tasks ☆95 Updated this week
- smolLM with Entropix sampler on pytorch ☆139 Updated 2 weeks ago
- Just a bunch of benchmark logs for different LLMs ☆114 Updated 3 months ago
- ☆104 Updated 8 months ago
- Simple Transformer in Jax ☆119 Updated 4 months ago
- ☆118 Updated 3 months ago
- Code repository for the c-BTM paper ☆105 Updated last year
- ☆48 Updated last year
- an implementation of Self-Extend, to expand the context window via grouped attention ☆118 Updated 10 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆84 Updated last week
- This is our own implementation of 'Layer Selective Rank Reduction' ☆232 Updated 5 months ago
- Cerule - A Tiny Mighty Vision Model ☆67 Updated 2 months ago
- ☆74 Updated 3 weeks ago
- inference code for mixtral-8x7b-32kseqlen ☆98 Updated 11 months ago
- QLoRA: Efficient Finetuning of Quantized LLMs ☆77 Updated 7 months ago
- An unsupervised model merging algorithm for Transformers-based language models. ☆100 Updated 6 months ago
- A Collection of Pydantic Models to Abstract IRL ☆15 Updated this week
- Erasing concepts from neural representations with provable guarantees ☆209 Updated last week
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆203 Updated 6 months ago
- A puzzle to learn about prompting ☆121 Updated last year