mistralai-sf24 / hackathon
☆446 Updated 10 months ago
Alternatives and similar repositories for hackathon:
Users who are interested in hackathon are comparing it to the libraries listed below.
- Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch. ☆688 Updated 5 months ago
- Automatically evaluate your LLMs in Google Colab ☆590 Updated 9 months ago
- ☆930 Updated 2 weeks ago
- A library for making RepE control vectors ☆551 Updated last month
- ☆679 Updated 2 weeks ago
- Visualize the intermediate output of Mistral 7B ☆339 Updated 3 weeks ago
- Train Models Contrastively in Pytorch ☆639 Updated this week
- Toolkit for attaching, training, saving and loading of new heads for transformer models ☆262 Updated 2 weeks ago
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆270 Updated 7 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars ☆973 Updated 6 months ago
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆705 Updated last year
- Official implementation of Half-Quadratic Quantization (HQQ) ☆748 Updated this week
- Code for Quiet-STaR ☆713 Updated 5 months ago
- ☆502 Updated 5 months ago
- Website for hosting the Open Foundation Models Cheat Sheet. ☆262 Updated 7 months ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆230 Updated 3 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆252 Updated 7 months ago
- The official implementation of Self-Play Fine-Tuning (SPIN) ☆1,117 Updated 9 months ago
- ☆412 Updated last year
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,361 Updated 10 months ago
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆841 Updated this week
- Training LLMs with QLoRA + FSDP ☆1,451 Updated 3 months ago
- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning ☆640 Updated 8 months ago
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining ☆687 Updated 10 months ago
- ☆496 Updated 3 months ago
- This is our own implementation of 'Layer Selective Rank Reduction' ☆233 Updated 8 months ago
- A bagel, with everything. ☆316 Updated 10 months ago
- The repository for the code of the UltraFastBERT paper ☆517 Updated 10 months ago
- ☆806 Updated 5 months ago
- Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vecto… ☆222 Updated this week