ksm26 / Pretraining-LLMs
Master the essential steps of pretraining large language models (LLMs). Learn to create high-quality datasets, configure model architectures, execute training runs, and assess model performance for efficient and effective LLM pretraining.
☆24Updated last year
Alternatives and similar repositories for Pretraining-LLMs
Users who are interested in Pretraining-LLMs are comparing it to the repositories listed below
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.☆197Updated last year
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.☆200Updated last year
- This is an implementation of the paper: Searching for Best Practices in Retrieval-Augmented Generation (EMNLP 2024)☆344Updated last year
- ☆30Updated last year
- This repository contains a custom implementation of the BERT model, fine-tuned for specific tasks, along with an implementation of Low Ra…☆78Updated 2 years ago
- ☆82Updated last year
- A set of scripts and notebooks on LLM finetuning and dataset creation☆116Updated last year
- ☆92Updated last week
- Implementation of 12 AI agents evaluation techniques☆35Updated 6 months ago
- ☆55Updated 5 months ago
- 1st Place Solution for LLM - Detect AI Generated Text Kaggle Competition☆211Updated last year
- Multimodal RAG using Langchain☆58Updated 2 years ago
- ☆27Updated last year
- Medical RAG QA App using Meditron 7B LLM, Qdrant Vector Database, and PubMedBERT Embedding Model.☆62Updated 2 years ago
- Unofficial implementation of https://arxiv.org/pdf/2407.14679☆53Updated last year
- Repository for the paper "MALADE: Orchestration of LLM-powered Agents with Retrieval Augmented Generation for Pharmacovigilance"☆23Updated 11 months ago
- Collection of resources for finetuning Large Language Models (LLMs).☆111Updated last year
- Advanced Retrieval-Augmented Generation (RAG) through practical notebooks, using the power of Langchain, OpenAI GPTs, META LLAMA3, A…☆443Updated last year
- Curated list of weekly published LLM papers☆200Updated 3 weeks ago
- RAFT, or Retrieval-Augmented Fine-Tuning, is a method comprising a fine-tuning phase and a RAG-based retrieval phase. It is particularly sui…☆164Updated last year
- Collection of resources for RL and Reasoning☆27Updated last year
- Maximizing the Performance of a Simple RAG using RL☆90Updated 10 months ago
- [ICLR'25] ApolloMoE: Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts☆52Updated last year
- ☆107Updated 10 months ago
- A comprehensive repository of reasoning tasks for Medical LLMs (and beyond)☆132Updated last year
- Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers☆84Updated 8 months ago
- LLM (Large Language Model) FineTuning☆565Updated 10 months ago
- A code repository that contains all the code for finetuning some popular LLMs on medical data☆71Updated last year
- Synthetic Data Generation using LLM via Argilla, Distilabel, ChatGPT, etc.☆30Updated last year
- An open-source recreation of the AgentInstruct agentic workflow for synthetic data generation☆24Updated 9 months ago