☆48 · Aug 29, 2024 · Updated last year
Alternatives and similar repositories for optimized_hf_llama_class_for_training
Users that are interested in optimized_hf_llama_class_for_training are comparing it to the libraries listed below.
- supporting pytorch FSDP for optimizers ☆84 · Dec 8, 2024 · Updated last year
- GPT* - Training faster small transformers using ALiBi, Parallel Residual Connections and more! ☆21 · Oct 29, 2022 · Updated 3 years ago
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given… ☆15 · Oct 16, 2023 · Updated 2 years ago
- ☆25 · Sep 3, 2025 · Updated 7 months ago
- Utilities for Training Very Large Models ☆59 · Sep 25, 2024 · Updated last year
- ☆10 · Dec 21, 2024 · Updated last year
- ☆11 · Oct 3, 2021 · Updated 4 years ago
- ☆12 · Apr 29, 2024 · Updated last year
- Full finetuning of large language models without large memory requirements ☆94 · Sep 22, 2025 · Updated 6 months ago
- A project for Korean automatic spacing ☆12 · Aug 3, 2020 · Updated 5 years ago
- A chat implementation for FastHTML ☆12 · Sep 14, 2025 · Updated 7 months ago
- baikal.ai's pre-trained BERT models: descriptions and sample codes ☆12 · Jun 24, 2021 · Updated 4 years ago
- AI model designed to test the effectiveness in handling external ethical attacks. ☆11 · Feb 9, 2026 · Updated 2 months ago
- ☆13 · Jan 22, 2025 · Updated last year
- [ICLR 2026] Quantile Advantage Estimation for Entropy-Safe Reasoning ☆24 · Oct 14, 2025 · Updated 6 months ago
- Deploy KoGPT with Triton Inference Server ☆14 · Nov 18, 2022 · Updated 3 years ago
- ☆14 · May 3, 2022 · Updated 3 years ago
- Fast, Modern, and Low Precision PyTorch Optimizers ☆129 · Dec 29, 2025 · Updated 3 months ago
- Paper Review about Speech Recognition · NLP ☆10 · Mar 25, 2021 · Updated 5 years ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆32 · Sep 22, 2024 · Updated last year
- Reinforcement Learning Agent that plays Heroic - Magic Duel ☆15 · Jun 23, 2020 · Updated 5 years ago
- ☆71 · Jul 11, 2024 · Updated last year
- ☆20 · Jul 12, 2023 · Updated 2 years ago
- Compiling useful links, papers, benchmarks, ideas, etc. ☆46 · Mar 16, 2025 · Updated last year
- Speech recognition and signal processing ☆14 · Sep 12, 2021 · Updated 4 years ago
- End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra. ☆10 · Jan 21, 2022 · Updated 4 years ago
- Collection of autoregressive model implementation ☆85 · Feb 23, 2026 · Updated last month
- On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning, … ☆19 · Mar 13, 2026 · Updated last month
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding. ☆13 · Nov 19, 2024 · Updated last year
- [WWW2022] Geometric Graph Representation Learning via Maximizing Rate Reduction ☆26 · May 27, 2022 · Updated 3 years ago
- We can crawl NaverBlog, Twitter, Youtube!! ☆14 · Sep 13, 2019 · Updated 6 years ago
- A fusion of a linear layer and a cross entropy loss, written for pytorch in triton. ☆75 · Aug 2, 2024 · Updated last year
- Unofficial implementation of AlpaGasus ☆94 · Sep 23, 2023 · Updated 2 years ago
- An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. ☆21 · Nov 28, 2022 · Updated 3 years ago
- ☆94 · Oct 5, 2023 · Updated 2 years ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers ☆23 · Feb 9, 2025 · Updated last year
- No code solution for training tabular models ☆35 · Jan 25, 2026 · Updated 2 months ago
- Archive of Blue House (Cheong Wa Dae) national petition data ☆15 · Aug 29, 2020 · Updated 5 years ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆209 · May 20, 2024 · Updated last year