Scripts for fine-tuning Llama2 via SFT and DPO.
☆206Aug 14, 2023Updated 2 years ago
Alternatives and similar repositories for llama2-fine-tune
Users that are interested in llama2-fine-tune are comparing it to the libraries listed below.
- Sakura-SOLAR-DPO: Merge, SFT, and DPO☆116Dec 30, 2023Updated 2 years ago
- ☆10Jan 20, 2024Updated 2 years ago
- Llama2-SFT, Llama-2-7B fine-tuning (transformers)/LORA (peft)/inference☆27Jul 26, 2023Updated 2 years ago
- Provides functions for converting 모두의 말뭉치 (Modu Corpus) data into a form convenient for analysis.☆11Mar 2, 2022Updated 4 years ago
- ☆11Oct 3, 2021Updated 4 years ago
- An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.☆14Mar 20, 2024Updated 2 years ago
- ☆11Apr 2, 2024Updated 2 years ago
- Korean embedding model specialized for the financial domain☆22Aug 8, 2024Updated last year
- An RLHF learning environment for Koreans☆25Sep 25, 2023Updated 2 years ago
- The most comprehensive and accurate LLM jailbreak attack benchmark by far☆22Mar 22, 2025Updated last year
- BERT score for text generation☆12Jan 15, 2025Updated last year
- Reference implementation for DPO (Direct Preference Optimization)☆2,875Aug 11, 2024Updated last year
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects☆23Jan 26, 2025Updated last year
- Official repository for KoMT-Bench built by LG AI Research☆71Aug 8, 2024Updated last year
- Korean Nested Named Entity Corpus☆20May 13, 2023Updated 2 years ago
- Official implementation of SIGIR 2022 Paper "Task-Oriented Dialogue System as Natural Language Generation".☆14Apr 6, 2022Updated 4 years ago
- DSBA code study☆30Nov 7, 2023Updated 2 years ago
- Exploring limitations of LLM-as-a-judge☆20Aug 17, 2024Updated last year
- incremental symbol learning for natural language understanding☆10Jun 12, 2023Updated 2 years ago
- A loader that lets you try running LLMs built for WebGPU.☆29Dec 20, 2023Updated 2 years ago
- Korean text data preprocess toolkit for NLP☆18Jun 11, 2019Updated 6 years ago
- ☆31Oct 15, 2021Updated 4 years ago
- ☆23Nov 26, 2024Updated last year
- The git repository of Modular Prompted Chatbot paper☆35May 24, 2023Updated 2 years ago
- Code for the experiments in the ACL 2020 paper "Estimating predictive uncertainty for rumour verification models"☆11May 15, 2020Updated 5 years ago
- Korean lexical semantic analysis model☆23Apr 4, 2022Updated 4 years ago
- Robust recipes to align language models with human and AI preferences☆5,551Apr 2, 2026Updated last week
- Korean Named Entity Corpus☆25May 12, 2023Updated 2 years ago
- Monitoring of a GPU system sending either Slack or Mattermost messages via webhooks☆12Jul 20, 2017Updated 8 years ago
- The multilingual language model for Switzerland☆28Jan 19, 2024Updated 2 years ago
- Simple extension for text-generation-webui that injects recent conversation history into the negative prompt with the goal of minimizing …☆32Nov 20, 2023Updated 2 years ago
- ☆19Nov 7, 2023Updated 2 years ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards☆47Apr 15, 2025Updated 11 months ago
- Testing DeepSpeed integration in 🤗 Accelerate☆11Jun 28, 2022Updated 3 years ago
- [Neural Networks 2025] The official code for the paper "MNet: A Multi-Scale Network for Visible Watermark Removal."☆17Jun 16, 2025Updated 9 months ago
- This repository contains papers for a comprehensive survey on accelerated generation techniques in Large Language Models (LLMs).☆11May 24, 2024Updated last year
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…☆16Apr 21, 2025Updated 11 months ago
- Natural Language Processing Tasks and Examples.☆62Aug 17, 2022Updated 3 years ago
- Joint Optimization of Cascade Ranking Models (WSDM 19)☆13Jun 21, 2022Updated 3 years ago