jiahe7ay / infini-mini-transformer
This is a personal reimplementation of Google's Infini-Transformer, built around a small 2B model. The project includes both the model and the training code.
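As background: the Infini-Transformer paper combines local dot-product attention with a per-head compressive memory that is read and written with a linear-attention kernel. Below is a minimal PyTorch sketch of one segment's memory step, assuming the paper's ELU+1 feature map; the function names, shapes, and epsilon are illustrative and not taken from this repository's code.

```python
import torch
import torch.nn.functional as F

def elu_plus_one(x):
    # Non-negative feature map used by Infini-attention: sigma(x) = ELU(x) + 1
    return F.elu(x) + 1.0

def infini_memory_step(q, k, v, M, z, eps=1e-6):
    """One segment of compressive-memory attention (illustrative shapes).

    q, k, v: (batch, seg_len, d_head)
    M:       (batch, d_head, d_head)  running key/value association matrix
    z:       (batch, d_head)          running key normalizer
    """
    sq, sk = elu_plus_one(q), elu_plus_one(k)
    # Retrieve values written by earlier segments: sigma(q) @ M / (sigma(q) . z)
    denom = (sq * z.unsqueeze(1)).sum(-1, keepdim=True).clamp_min(eps)
    retrieved = (sq @ M) / denom
    # Accumulate the current segment into memory (linear-attention update)
    M = M + sk.transpose(1, 2) @ v
    z = z + sk.sum(dim=1)
    return retrieved, M, z

# Tiny usage example: memory starts empty, so the first retrieval is zero
B, L, d = 2, 128, 64
q, k, v = (torch.randn(B, L, d) for _ in range(3))
M, z = torch.zeros(B, d, d), torch.zeros(B, d)
out, M, z = infini_memory_step(q, k, v, M, z)
```

In the paper, this retrieved stream is blended with ordinary causal attention over the current segment via a learned per-head gate, which is what keeps memory cost constant regardless of total context length.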
☆58 · Updated last year
Alternatives and similar repositories for infini-mini-transformer
Users interested in infini-mini-transformer are comparing it to the repositories listed below.
- code for Scaling Laws of RoPE-based Extrapolation ☆73 · Updated last year
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models ☆40 · Updated last year
- ☆102 · Updated 9 months ago
- 1.4B sLLM for Chinese and English - HammerLLM🔨 ☆44 · Updated last year
- ☆48 · Updated last year
- An Experiment on Dynamic NTK Scaling RoPE ☆64 · Updated last year
- SuperCLUE-Math6: exploring a new generation of natively Chinese multi-turn, multi-step mathematical reasoning datasets ☆59 · Updated last year
- The complete training code for an open-source, high-performance Llama model, covering the full pipeline from pre-training to RLHF ☆66 · Updated 2 years ago
- ☆107 · Updated last year
- ☆36 · Updated 10 months ago
- [ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models” ☆121 · Updated 6 months ago
- NTK-scaled version of ALiBi position encoding in Transformer ☆68 · Updated last year
- Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales