SeunghyunSEO / optimized_hf_llama_class_for_training
☆48Aug 29, 2024Updated last year
Alternatives and similar repositories for optimized_hf_llama_class_for_training
Users that are interested in optimized_hf_llama_class_for_training are comparing it to the libraries listed below
- Easily run PyTorch on multiple GPUs & machines☆57Jan 8, 2026Updated last month
- A project for Korean automatic spacing☆12Aug 3, 2020Updated 5 years ago
- Paper Review about Speech Recognition · NLP☆10Mar 25, 2021Updated 4 years ago
- A chat implementation for FastHTML☆11Sep 14, 2025Updated 5 months ago
- AI model designed to test the effectiveness in handling external ethical attacks.☆11Feb 9, 2026Updated last week
- Tool to apply Legal Matter Specification Standard (LMSS) to documents☆12Aug 15, 2024Updated last year
- ☆10Dec 21, 2024Updated last year
- Deploy KoGPT with Triton Inference Server☆14Nov 18, 2022Updated 3 years ago
- ☆13Jan 22, 2025Updated last year
- ☆11Oct 3, 2021Updated 4 years ago
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…☆15Oct 16, 2023Updated 2 years ago
- End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.☆10Jan 21, 2022Updated 4 years ago
- ☆80Jun 5, 2024Updated last year
- [ICLR 2026] Quantile Advantage Estimation for Entropy-Safe Reasoning☆23Oct 14, 2025Updated 4 months ago
- ☆14Oct 18, 2023Updated 2 years ago
- ☆13Dec 21, 2025Updated last month
- Speech recognition and signal processing☆14Sep 12, 2021Updated 4 years ago
- ☆10May 22, 2023Updated 2 years ago
- We can crawl NaverBlog, Twitter, Youtube!!☆14Sep 13, 2019Updated 6 years ago
- ☆13Apr 22, 2024Updated last year
- ☆14May 3, 2022Updated 3 years ago
- Full finetuning of large language models without large memory requirements☆94Sep 22, 2025Updated 4 months ago
- Fast, Modern, and Low Precision PyTorch Optimizers☆124Dec 29, 2025Updated last month
- PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing☆21Mar 18, 2025Updated 10 months ago
- Data archive of Blue House (Cheong Wa Dae) national petitions☆15Aug 29, 2020Updated 5 years ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers☆23Feb 9, 2025Updated last year
- ☆21Mar 23, 2022Updated 3 years ago
- ☆55Updated this week
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters"☆22Oct 14, 2025Updated 4 months ago
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference.☆46Jun 11, 2025Updated 8 months ago
- ☆21Feb 21, 2022Updated 3 years ago
- Few-shot learning using EleutherAI's GPT-Neo, an open-source version of GPT-3☆18Jul 8, 2021Updated 4 years ago
- An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.☆21Nov 28, 2022Updated 3 years ago
- supporting pytorch FSDP for optimizers☆84Dec 8, 2024Updated last year
- Local Attention - Flax module for Jax☆22May 26, 2021Updated 4 years ago
- Korean text data preprocessing toolkit for NLP☆18Jun 11, 2019Updated 6 years ago
- ☆33Jun 3, 2025Updated 8 months ago
- ☆20Jul 12, 2023Updated 2 years ago
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …☆18Apr 12, 2024Updated last year