Hannibal046 / nanoRWKV
A nanoGPT-style implementation of the RWKV language model, an RNN with GPT-level LLM performance.
☆193 Updated 10 months ago
Related projects:
- A recipe for online RLHF. ☆376 Updated 3 weeks ago
- MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models ☆372 Updated 7 months ago
- ☆189 Updated 2 months ago
- [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding ☆203 Updated 2 weeks ago
- This is the official code repository of MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tas… ☆60 Updated 3 weeks ago
- Code and Checkpoints for "Generate rather than Retrieve: Large Language Models are Strong Context Generators" in ICLR 2023. ☆276 Updated last year
- The official implementation of Self-Play Preference Optimization (SPPO) ☆461 Updated last month
- Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression" ☆83 Updated last week
- WorldGPT: Empowering LLM as Multimodal World Model ☆116 Updated last month
- Grimoire is All You Need for Enhancing Large Language Models ☆115 Updated 6 months ago
- Recipes to train reward model for RLHF. ☆634 Updated last week
- Evaluating LLMs with Dynamic Data ☆66 Updated 2 weeks ago
- ☆356 Updated 4 months ago
- Benchmarking LLMs via Uncertainty Quantification ☆206 Updated 7 months ago
- ☆347 Updated 3 months ago
- AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval (https://arxiv.org/abs/2406.11200) ☆140 Updated last month
- A Comprehensive Benchmark for Code Information Retrieval. ☆61 Updated last week
- ExtremeBERT is a toolkit that accelerates the pretraining of customized language models on customized datasets, described in the paper “E… ☆281 Updated last year
- Mathematical Visual Instruction Tuning for Multi-modal Large Language Models ☆86 Updated last month
- Here we will test various linear attention designs. ☆55 Updated 4 months ago
- An Extensible Framework for Retrieval-Augmented LLM Applications: Learning Relevance Beyond Simple Similarity. ☆42 Updated last month
- ☆113 Updated last year
- We leverage 14 datasets as OOD test data and conduct evaluations on 8 NLU tasks over 21 popularly used models. Our findings confirm that … ☆115 Updated last year
- AAGPT is another experimental open-source application showcasing the capabilities of large language models, such as GPT-3.5 and GPT-4. ☆154 Updated last year
- A NanoGPT implementation in PyTorch 2.0, written around the torch 2.0 release: faster and simpler, and a good tutorial for learning GPT. ☆59 Updated 7 months ago
- The framework to prune LLMs to any size and any config. ☆96 Updated 6 months ago
- An interpretable large language model (LLM) for medical diagnosis. ☆68 Updated last week
- [ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? ☆133 Updated 2 weeks ago
- [ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear at… ☆95 Updated 3 months ago
- RWKV in nanoGPT style ☆170 Updated 3 months ago