fshnkarimi / Fine-tuning-an-LLM-using-LoRA
Text Classification with LoRA (Low-Rank Adaptation) of Language Models: efficiently fine-tune large language models for text classification tasks using the Stanford Sentiment Treebank (SST-2) dataset and the LoRA technique.
☆32, updated 11 months ago
Related projects:
- ☆118, updated 5 months ago
- ☆105, updated this week
- A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human… ☆55, updated last year
- ☆30, updated 4 months ago
- Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning ☆40, updated 9 months ago
- ☆19, updated 2 months ago
- Astraios: Parameter-Efficient Instruction Tuning Code Language Models ☆57, updated 5 months ago
- Plug-and-play implementation of "Textbooks Are All You Need", ready for training, inference, and dataset generation ☆75, updated last year
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; arXiv preprint arXiv:2403.… ☆34, updated 2 months ago
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335) ☆25, updated 6 months ago
- Small and Efficient Mathematical Reasoning LLMs ☆69, updated 7 months ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆82, updated 2 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆65, updated 2 months ago
- ☆75, updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆39, updated 3 weeks ago
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆68, updated last week
- LangChain, Llama2-Chat, and zero- and few-shot prompting are used to generate synthetic datasets for IR and RAG system evaluation ☆32, updated 9 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48, updated 2 months ago
- Evaluation and analysis code for LLM360 ☆75, updated 3 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64, updated 2 months ago
- Code for the paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning" ☆30, updated 3 months ago
- [NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`. ☆133, updated 10 months ago
- ☆52, updated 7 months ago
- ☆31, updated 3 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆67, updated 2 months ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆42, updated last week
- This is the code repo for our paper "Revealing the Treasures of Knowledge via Active Learning". ☆88, updated 6 months ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆118, updated last week
- Benchmarking LLMs' Emotional Alignment with Humans ☆60, updated last month
- ☆12, updated 6 months ago