bayjarvis / llm
Fine-tuning, DPO, RLHF, RLAIF on LLMs - Zephyr 7B GPTQ with 4-Bit Quantization, Mistral-7B-GPTQ
☆12Updated last year
Alternatives and similar repositories for llm
Users who are interested in llm are comparing it to the libraries listed below
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆33Updated last year
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family.☆29Updated last month
- Zephyr 7B beta RAG Demo inside a Gradio app powered by BGE Embeddings, ChromaDB, and Zephyr 7B Beta LLM.☆34Updated last year
- ☆24Updated last year
- Finetune any model on HF in less than 30 seconds☆58Updated last month
- ☆48Updated 6 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models☆69Updated last year
- Implementation of the Mamba SSM with hf_integration.☆56Updated 8 months ago
- Experimental sampler to make LLMs more creative☆31Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆26Updated 6 months ago
- Verifiers for LLM Reinforcement Learning☆50Updated last month
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated 10 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆24Updated last year
- Latent Large Language Models☆18Updated 8 months ago
- ☆25Updated 7 months ago
- 🐜🔧 A minimalistic tool to fine-tune your LLMs☆18Updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for…☆25Updated 5 months ago
- ☆12Updated 2 weeks ago
- ☆13Updated 5 months ago
- Small and Efficient Mathematical Reasoning LLMs☆71Updated last year
- ☆26Updated last year
- Fine tune Gemma 3 on an object detection task☆20Updated this week
- implementation of https://arxiv.org/pdf/2312.09299☆20Updated 10 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models☆22Updated 5 months ago
- Set of scripts to finetune LLMs☆37Updated last year
- Reward Model framework for LLM RLHF☆61Updated last year
- ☆43Updated 3 months ago
- Fine-tune and quantize Llama-2-like models to generate Python code using QLoRA, Axolotl,..☆64Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆34Updated last year
- Based on the tree of thoughts paper☆48Updated last year