lamm-mit / LLM-finetuning
☆24 Updated 8 months ago
Alternatives and similar repositories for LLM-finetuning
Users who are interested in LLM-finetuning are comparing it to the libraries listed below.
- minimal GRPO implementation from scratch ☆90 Updated 2 months ago
- ☆65 Updated 2 months ago
- ☆72 Updated last week
- Agent framework for constructing language model agents and training on constructive tasks. ☆84 Updated this week
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆111 Updated 3 months ago
- Official Implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs" ☆208 Updated 6 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆90 Updated 2 months ago
- ☆111 Updated 8 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆53 Updated this week
- The official implementation of the paper "Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models". ☆70 Updated last month
- ☆201 Updated 2 months ago
- Repository for Zochi's Research ☆60 Updated last month
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆46 Updated 11 months ago
- ☆90 Updated 2 weeks ago
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training" ☆132 Updated last month
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning" ☆85 Updated this week
- Fine-Tuning Llama3-8B LLM in a multi-GPU environment using DeepSpeed ☆17 Updated 11 months ago
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆162 Updated this week
- A comprehensive repository of reasoning tasks for Medical LLMs (and beyond) ☆121 Updated 8 months ago
- Train your own SOTA deductive reasoning model ☆92 Updated 2 months ago
- Automated Hypothesis Testing with Agentic Sequential Falsifications ☆186 Updated this week
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆67 Updated 5 months ago
- Tina: Tiny Reasoning Models via LoRA ☆213 Updated this week
- accompanying material for sleep-time compute paper ☆83 Updated 2 weeks ago
- ☆150 Updated 2 months ago
- X-LoRA: Mixture of LoRA Experts ☆223 Updated 9 months ago
- A simplified implementation for experimenting with RLVR on GSM8K. This repository provides a starting point for exploring reasoning. ☆89 Updated 3 months ago
- Graph-Aware Attention for Adaptive Dynamics in Transformers ☆59 Updated 4 months ago
- Unofficial implementation of https://arxiv.org/pdf/2407.14679 ☆44 Updated 8 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 Updated last year