hpcaitech / Titans
A collection of models built with ColossalAI
☆32 · Updated 2 years ago
Alternatives and similar repositories for Titans:
Users who are interested in Titans are comparing it to the libraries listed below.
- An Experiment on Dynamic NTK Scaling RoPE ☆63 · Updated last year
- Fast LLM training codebase with dynamic strategy choosing [DeepSpeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆36 · Updated last year
- Scalable PaLM implementation in PyTorch ☆190 · Updated 2 years ago
- ☆104 · Updated last year
- A memory-efficient DLRM training solution using ColossalAI ☆104 · Updated 2 years ago
- An Implementation of "Orca: Progressive Learning from Complex Explanation Traces of GPT-4" ☆43 · Updated 6 months ago
- Simple and efficient PyTorch-native transformer training and inference (batched) ☆73 · Updated last year
- Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI ☆57 · Updated last year
- A plug-in for Microsoft DeepSpeed that fixes a bug in the DeepSpeed pipeline ☆26 · Updated 4 years ago
- ☆98 · Updated 6 months ago
- Distill ChatGPT's coding ability into a small model (1B) ☆28 · Updated last year
- Set up the environment for vLLM users ☆16 · Updated last year
- ☆81 · Updated last year
- Code for preprint "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)" ☆36 · Updated 3 weeks ago
- A Python implementation of Toolformer using Huggingface Transformers ☆15 · Updated 2 years ago
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆76 · Updated last year
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆101 · Updated 3 weeks ago
- A GPT-based generative LM for combined text and math formulas, leveraging tree-based formula encoding. ☆35 · Updated last year
- NaturalCodeBench (Findings of ACL 2024) ☆62 · Updated 6 months ago
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following ☆79 · Updated 7 months ago
- LLMs as Collaboratively Edited Knowledge Bases ☆45 · Updated last year
- Using FlexAttention to compute attention with different masking patterns ☆43 · Updated 6 months ago
- ☆30 · Updated last year
- An experimental implementation of the retrieval-enhanced language model ☆74 · Updated 2 years ago
- Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆142 · Updated last month
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆55 · Updated last week
- A unified tokenization tool for images, Chinese, and English. ☆152 · Updated 2 years ago
- Linear Attention Sequence Parallelism (LASP) ☆81 · Updated 10 months ago
- Repository for analysis and experiments in the BigCode project. ☆117 · Updated last year
- ☆34 · Updated last year