☆48Aug 29, 2024Updated last year
Alternatives and similar repositories for optimized_hf_llama_class_for_training
Users that are interested in optimized_hf_llama_class_for_training are comparing it to the libraries listed below.
- Easily run PyTorch on multiple GPUs & machines☆60Jan 8, 2026Updated 2 months ago
- supporting pytorch FSDP for optimizers☆84Dec 8, 2024Updated last year
- GPT* - Training faster small transformers using ALiBi, Parallel Residual Connections and more!☆21Oct 29, 2022Updated 3 years ago
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…☆15Oct 16, 2023Updated 2 years ago
- Utilities for Training Very Large Models☆58Sep 25, 2024Updated last year
- ☆10Dec 21, 2024Updated last year
- Tool to apply Legal Matter Specification Standard (LMSS) to documents☆12Aug 15, 2024Updated last year
- ☆11Oct 3, 2021Updated 4 years ago
- ☆12Apr 29, 2024Updated last year
- Full finetuning of large language models without large memory requirements☆94Sep 22, 2025Updated 6 months ago
- ☆80Jun 5, 2024Updated last year
- A project for Korean automatic spacing☆12Aug 3, 2020Updated 5 years ago
- ☆21Mar 23, 2022Updated 4 years ago
- ☆14May 3, 2022Updated 3 years ago
- baikal.ai's pre-trained BERT models: descriptions and sample codes☆12Jun 24, 2021Updated 4 years ago
- AI model designed to test the effectiveness in handling external ethical attacks.☆11Feb 9, 2026Updated last month
- ☆13Jan 22, 2025Updated last year
- ☆14Dec 21, 2025Updated 3 months ago
- [ICLR 2026] Quantile Advantage Estimation for Entropy-Safe Reasoning☆24Oct 14, 2025Updated 5 months ago
- Paper Review about Speech Recognition · NLP☆10Mar 25, 2021Updated 5 years ago
- Fast, Modern, and Low Precision PyTorch Optimizers☆129Dec 29, 2025Updated 3 months ago
- Codebase for Instruction Following without Instruction Tuning☆36Sep 24, 2024Updated last year
- a pipeline for using api calls to agnostically convert unstructured data into structured training data☆32Sep 22, 2024Updated last year
- ☆71Jul 11, 2024Updated last year
- ☆20Jul 12, 2023Updated 2 years ago
- Energetic Graph Neural Networks (EGNN) implementation based on Dirichlet Energy Constrained Learning.☆27Nov 1, 2021Updated 4 years ago
- End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.☆10Jan 21, 2022Updated 4 years ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs"☆155Oct 15, 2024Updated last year
- Collection of autoregressive model implementations☆85Feb 23, 2026Updated last month
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆59Oct 18, 2025Updated 5 months ago
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding.☆13Nov 19, 2024Updated last year
- [WWW2022] Geometric Graph Representation Learning via Maximizing Rate Reduction☆26May 27, 2022Updated 3 years ago
- We can crawl NaverBlog, Twitter, Youtube!!☆14Sep 13, 2019Updated 6 years ago
- A fusion of a linear layer and a cross entropy loss, written for pytorch in triton.☆75Aug 2, 2024Updated last year
- Few-shot learning using EleutherAI's GPT-Neo, an open-source version of GPT-3☆18Jul 8, 2021Updated 4 years ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers☆23Feb 9, 2025Updated last year
- Archive of Blue House (Cheong Wa Dae) national petition data☆15Aug 29, 2020Updated 5 years ago
- No code solution for training tabular models☆35Jan 25, 2026Updated 2 months ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Lengths (ICLR 2024)☆209May 20, 2024Updated last year