hpcaitech / Titans
A collection of models built with ColossalAI
☆32 · Updated 2 years ago
Alternatives and similar repositories for Titans:
Users who are interested in Titans are comparing it to the libraries listed below.
- Scalable PaLM implementation in PyTorch ☆190 · Updated 2 years ago
- A memory efficient DLRM training solution using ColossalAI ☆104 · Updated 2 years ago
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆37 · Updated last year
- An Experiment on Dynamic NTK Scaling RoPE ☆64 · Updated last year
- Repository for analysis and experiments in the BigCode project. ☆118 · Updated last year
- An Implementation of "Orca: Progressive Learning from Complex Explanation Traces of GPT-4" ☆43 · Updated 6 months ago
- ☆106 · Updated last year
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆77 · Updated last year
- MultilingualShareGPT, the free multi-language corpus for LLM training ☆72 · Updated 2 years ago
- ☆75 · Updated last month
- LMTuner: Make the LLM Better for Everyone ☆35 · Updated last year
- This project studies the performance and robustness of language models and task-adaptation methods. ☆150 · Updated 11 months ago
- Distributed IO-aware Attention algorithm ☆20 · Updated 8 months ago
- Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI ☆57 · Updated last year
- Async pipelined version of Verl ☆66 · Updated last month
- ☆30 · Updated last year
- A Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆177 · Updated last year
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆103 · Updated last month
- Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆142 · Updated last month
- NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference ☆65 · Updated 5 months ago
- ☆98 · Updated 7 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 · Updated last year
- Code for Scaling Laws of RoPE-based Extrapolation ☆73 · Updated last year
- Source code for ACL 2023 paper Decoder Tuning: Efficient Language Understanding as Decoding ☆49 · Updated last year
- [ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆116 · Updated 5 months ago
- Inspired by Google C4, here is a series of colossal clean data cleaning scripts focused on CommonCrawl data processing. Including Chinese… ☆124 · Updated last year
- The aim of this repository is to utilize LLaMA to reproduce and enhance the Stanford Alpaca ☆97 · Updated 2 years ago
- Distill ChatGPT coding ability into a small model (1B) ☆29 · Updated last year
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models ☆82 · Updated last year
- A unified tokenization tool for Images, Chinese and English. ☆152 · Updated 2 years ago